Kafka, mail # user - Proper use of ConsumerConnector


Re: Proper use of ConsumerConnector
Joel Koshy 2012-12-20, 18:06
> "unless you have a good reason to load balance and manage offsets manually"
>
> In general, one consumer connector consumes more than one partition.
> On the client side, we want to record every partition's offset for each
> message, so that if a crash happens (some messages were fetched from Kafka
> but the results were not flushed to disk),
> we can use the offset info to rewind the Kafka consumer.
>
> Do you think this is a good reason to use SimpleConsumer rather than
> ConsumerConnector?
An alternative to using SimpleConsumer in this use case is to use the
ZooKeeper consumer connector and turn off auto commit. After your consumer
process is done processing a batch of messages you can call commitOffsets -
the main caveat to be aware of is that if your consumer processes batches
very fast you would write to ZooKeeper just as often - so in fact setting an
autocommit interval and being willing to deal with duplicates is almost
equivalent. KAFKA-657 would help I think - since once that API is available
you can store your offsets anywhere you like.

Joel
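
A minimal sketch of the commit-after-batch pattern described above, assuming auto commit is turned off in the consumer configuration. To stay self-contained it does not use the Kafka classes: OffsetCommitter is a hypothetical stand-in for the connector's commitOffsets call, BatchHandler is a hypothetical callback for the application's own work (for example flushing results to disk), and a plain Iterator stands in for the message stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the single connector call this pattern needs.
interface OffsetCommitter {
    void commitOffsets();
}

class BatchCommitSketch {
    // Hypothetical callback for application work on one batch.
    interface BatchHandler<T> {
        void handle(List<T> batch);
    }

    // Drains `messages` in batches of up to `batchSize` and commits only after
    // a batch has been handled. Returns the number of commits issued.
    static <T> int consume(Iterator<T> messages, int batchSize,
                           BatchHandler<T> handler, OffsetCommitter committer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (messages.hasNext()) {
            batch.add(messages.next());
            if (batch.size() == batchSize) {
                handler.handle(batch);      // if this throws, no commit happens
                committer.commitOffsets();  // one offset write per batch
                commits++;
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {             // trailing partial batch
            handler.handle(batch);
            committer.commitOffsets();
            commits++;
        }
        return commits;
    }
}
```

Because the commit follows the handler, a crash between the two replays the last batch on restart, so the handler has to tolerate duplicates. The batch size also sets how often offsets get written, which is the caveat about fast batches.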
>
> On 12-12-20 3:16 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
>
> >In general, you should use the consumer connector - unless you have a good
> >reason to load balance and manage offsets manually (which is taken care of
> >in the consumer connector).
> >
> >
> >> - Does the ConsumerConnector manage connections to multiple brokers,
> >> or just a single broker?
> >>
> >
> >Multiple brokers.
> >
> >
> >> - Does the ConsumerConnector require a thread for each partition on
> >> each broker? (If not, how many threads does it require?)
> >>
> >
> >You can specify how many streams you want - if there are more partitions
> >than threads, then a given thread can consume from multiple partitions. If
> >there are more threads than available partitions, there will be idle
> >threads.
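
To illustrate the point about streams and partitions, here is a rough sketch of one simple way to divide partitions among a fixed number of streams. The contiguous-range split and the names StreamAssignmentSketch and assign are assumptions made for illustration and are not claimed to be the connector's actual rebalancing logic.

```java
import java.util.ArrayList;
import java.util.List;

class StreamAssignmentSketch {
    // Splits partition ids 0..numPartitions-1 into contiguous ranges, one per
    // stream. Earlier streams take the remainder; surplus streams get nothing.
    static List<List<Integer>> assign(int numPartitions, int numStreams) {
        if (numPartitions < 0 || numStreams <= 0) {
            throw new IllegalArgumentException("need numPartitions >= 0 and numStreams > 0");
        }
        int base = numPartitions / numStreams;   // partitions every stream gets
        int extra = numPartitions % numStreams;  // streams that get one more
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int s = 0; s < numStreams; s++) {
            int count = base + (s < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }
}
```

With this split, assign(5, 2) gives [[0, 1, 2], [3, 4]], so each stream reads several partitions, while assign(2, 4) gives [[0], [1], [], []], leaving two streams idle.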
> >
> >
> >> - Does the ConsumerConnector use actual asynchronous IO, or does it
> >> mimic it by using a dedicated behind-the-scenes thread (and the
> >> traditional java socket API)?
> >>
> >
> >The consumer connector uses SimpleConsumers for each broker that it
> >connects to. These consumers fetch from each broker and insert chunks into
> >blocking queues which the consumer iterators then dequeue.
> >
> >Joel
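
The fetch-and-queue arrangement described above can be approximated with plain java.util.concurrent classes. This is only a sketch under stated assumptions: each broker is simulated by a pre-built list of chunks, and ChunkQueueSketch, QueueIterator, the SHUTDOWN sentinel and the queue capacity of 2 are all hypothetical names and values, not the connector's internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ChunkQueueSketch {
    // Sentinel chunk, compared by identity, that tells the iterator to stop.
    static final List<String> SHUTDOWN = new ArrayList<>();

    // Blocks on the queue and flattens dequeued chunks into single messages.
    static class QueueIterator implements Iterator<String> {
        private final BlockingQueue<List<String>> queue;
        private Iterator<String> current = Collections.emptyIterator();
        private boolean done = false;

        QueueIterator(BlockingQueue<List<String>> queue) {
            this.queue = queue;
        }

        @Override
        public boolean hasNext() {
            while (!done && !current.hasNext()) {
                try {
                    List<String> chunk = queue.take(); // waits for a fetcher
                    if (chunk == SHUTDOWN) {
                        done = true;
                    } else {
                        current = chunk.iterator();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    done = true;
                }
            }
            return !done;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    // Starts one simulated fetcher thread per broker, all feeding one queue,
    // and returns every message the iterator dequeues.
    static List<String> run(List<List<List<String>>> chunksPerBroker) {
        // Bounded so fetchers block when the consumer falls behind (assumed size).
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>(2);
        List<Thread> fetchers = new ArrayList<>();
        for (List<List<String>> brokerChunks : chunksPerBroker) {
            Thread fetcher = new Thread(() -> {
                try {
                    for (List<String> chunk : brokerChunks) {
                        queue.put(chunk);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            fetchers.add(fetcher);
            fetcher.start();
        }
        // Enqueue the sentinel only after every fetcher has finished.
        Thread closer = new Thread(() -> {
            try {
                for (Thread fetcher : fetchers) {
                    fetcher.join();
                }
                queue.put(SHUTDOWN);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        closer.start();

        List<String> out = new ArrayList<>();
        Iterator<String> it = new QueueIterator(queue);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

Messages from one simulated broker keep their relative order, but the interleaving across brokers depends on thread scheduling. The calling thread only ever blocks on the queue, never on a socket.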
>
>
>