Re: Proper use of ConsumerConnector
Joel Koshy 2012-12-20, 18:06
> "unless you have a good reason to load balance and manage offsets manually"
> In general, one consumer connector consumes more than one partition.
> On the client side, we want to get the offsets of all partitions for any
> message. If a crash happens (some messages are fetched from Kafka but the
> result is not flushed to disk), we can use the offset info to rewind the
> Kafka consumer.
> Do you think this is a good reason to use SimpleConsumer rather than
An alternative to using SimpleConsumer in this use case is to use the
ZooKeeper consumer connector and turn off auto commit. After your consumer
process is done processing a batch of messages you can call commitOffsets.
The main caveat to be aware of is that if your consumer processes batches
very fast you would write to ZooKeeper that often, so in fact setting an
autocommit interval and being willing to deal with duplicates is almost
equivalent. KAFKA-657 would help, I think, since once that API is available
you can store your offsets anywhere you like.
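
To make the trade-off concrete, here is a rough sketch of the "process a batch, then commit" pattern. It is plain Python with no Kafka client involved: `OffsetStore`, `consume`, and `crash_after` are hypothetical names used only for illustration, and the in-memory store merely stands in for wherever offsets are kept (ZooKeeper today). It shows one store write per batch, and duplicates after a crash in the middle of a batch.

```python
class OffsetStore:
    """Hypothetical stand-in for wherever committed offsets live."""

    def __init__(self):
        self.committed = 0  # next offset to read after a restart
        self.writes = 0     # number of commit writes issued

    def commit(self, offset):
        self.committed = offset
        self.writes += 1


def consume(log, store, batch_size, crash_after=None):
    """Process `log` from the last committed offset, committing once per batch.

    Returns the messages processed in this run. If `crash_after` is set, the
    run stops after that many messages without committing the partial batch.
    """
    processed = []
    offset = store.committed
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        for msg in batch:
            if crash_after is not None and len(processed) == crash_after:
                return processed  # simulated crash: this batch is not committed
            processed.append(msg)
        offset += len(batch)
        store.commit(offset)  # commit only after the whole batch is processed
    return processed


log = [f"m{i}" for i in range(10)]
store = OffsetStore()
first = consume(log, store, batch_size=4, crash_after=6)  # crashes mid-batch
second = consume(log, store, batch_size=4)                # restart and finish
```

In this sketch the second run starts again from the last committed offset, so the messages handled after that commit and before the crash are seen twice. Smaller batches narrow that duplicate window but increase the number of commit writes.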
> On 12-12-20 3:16 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
> >In general, you should use the consumer connector - unless you have a good
> >reason to load balance and manage offsets manually (which is taken care of
> >in the consumer connector).
> >> - Does the ConsumerConnector manage connections to multiple brokers,
> >> or just a single broker?
> >Multiple brokers.
> >> - Does the ConsumerConnector require a thread for each partition on
> >> each broker? (If not, how many threads does it require?)
> >You can specify how many streams you want - if there are more partitions
> >than threads, then a given thread can consume from multiple partitions. If
> >there are more threads than available partitions, there will be idle
> >threads.
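
As an illustration of the two cases described above, the following sketch splits partitions across a requested number of streams. It assumes a simple contiguous-range split and is not claimed to be the connector's actual rebalancing algorithm; `assign_partitions` is a hypothetical helper.

```python
def assign_partitions(partitions, num_streams):
    """Split partitions into contiguous ranges, one range per stream."""
    if num_streams <= 0:
        raise ValueError("num_streams must be positive")
    ordered = sorted(partitions)
    base, extra = divmod(len(ordered), num_streams)
    assignment = {}
    start = 0
    for stream in range(num_streams):
        # The first `extra` streams each take one additional partition.
        count = base + (1 if stream < extra else 0)
        assignment[stream] = ordered[start:start + count]
        start += count
    return assignment


# More partitions than streams: each stream consumes several partitions.
more_partitions = assign_partitions(range(5), 2)

# More streams than partitions: the surplus streams get nothing and sit idle.
more_streams = assign_partitions(range(2), 4)
```

With five partitions and two streams, every stream has work. With two partitions and four streams, two of the streams end up with an empty list.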
> >> - Does the ConsumerConnector use actual asynchronous IO, or does it
> >> mimic it by using a dedicated behind-the-scenes thread (and the
> >> traditional java socket API)?
> >The consumer connector uses SimpleConsumers for each broker that it
> >connects to. These consumers fetch from each broker and insert chunks into
> >blocking queues which the consumer iterators then dequeue.
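
The fetch-and-queue arrangement described above can be sketched roughly as follows. This is a standard-library Python simulation, not the connector's code: `fetcher`, `stream_iterator`, the `SHUTDOWN` sentinel, and the in-memory broker logs are all assumptions made for illustration. One thread per broker pushes chunks into a bounded blocking queue, and an iterator dequeues them.

```python
import queue
import threading

SHUTDOWN = object()  # sentinel telling the iterator to stop


def fetcher(broker_log, chunk_size, out_queue):
    """Stand-in for a per-broker fetcher: push fixed-size chunks into the queue."""
    for i in range(0, len(broker_log), chunk_size):
        out_queue.put(broker_log[i:i + chunk_size])  # blocks while the queue is full


def stream_iterator(in_queue):
    """Yield individual messages, blocking on the queue until a chunk arrives."""
    while True:
        chunk = in_queue.get()
        if chunk is SHUTDOWN:
            return
        yield from chunk


broker_logs = {
    "broker-0": [f"b0-{i}" for i in range(5)],
    "broker-1": [f"b1-{i}" for i in range(3)],
}
chunks = queue.Queue(maxsize=2)  # bounded, so slow consumers apply back-pressure

consumed = []
consumer_thread = threading.Thread(
    target=lambda: consumed.extend(stream_iterator(chunks))
)
consumer_thread.start()

fetchers = [
    threading.Thread(target=fetcher, args=(broker_log, 2, chunks))
    for broker_log in broker_logs.values()
]
for t in fetchers:
    t.start()
for t in fetchers:
    t.join()

chunks.put(SHUTDOWN)  # all fetchers are done; let the iterator finish
consumer_thread.join()
```

Messages from different brokers may interleave in any order, but within one broker the order is preserved because each fetcher enqueues its chunks sequentially.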