Kafka >> mail # user >> Re: a few questions from high level consumer documentation.


Re: a few questions from high level consumer documentation.
I'll try to answer some; the Kafka team will need to answer the others:
On Wed, May 8, 2013 at 12:17 PM, Yu, Libo <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I read this link
> https://cwiki.apache.org/KAFKA/consumer-group-example.html
> and have a few questions (if not too many).
>
> 1 When you say the iterator may block, do you mean hasNext() may block?
>

Yes.
>
> 2 "Remember, you can only use a single process per Consumer Group."
>     Do you mean we can only use a single process on one node of the
> cluster for a consumer group?
>     Or there can be only one process on the whole cluster for a consumer
> group? Please clarify on this.
>

That's a bug in the documentation. I'll change it. When I wrote this I
misunderstood the rebalancing step. I missed this reference but fixed the
others. Sorry.

> 3 Why save offset to zookeeper? Is it easier to save it to a local file?
>
> 4 When client exits/crashes or leader for a partition is changed,
> duplicate messages may be replayed. "To help avoid this (replayed duplicate
> messages), make sure you provide a clean way for your client to exit
> instead of assuming it can be 'kill -9'd."
>
> a.       For client exit, if the client is receiving data at the time, how
> to do a clean exit? How can client tell consumer to write offset to
> zookeeper before exiting?
>

If you call the shutdown() method on the Consumer it will cleanly stop,
releasing any blocked iterators. In the example it goes to sleep for a few
seconds then cleanly shuts down.
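
If you don't want to rely on a fixed sleep, one common pattern is to call
shutdown() from a JVM shutdown hook, so a normal kill or Ctrl-C still stops the
consumer cleanly. This is only a sketch: ShutdownableConsumer and CleanExit are
hypothetical names of mine, standing in for whatever consumer object you hold,
and the only thing assumed about it is the shutdown() method mentioned above.

```java
// Hypothetical stand-in for the high-level consumer object; the only
// method assumed here is shutdown(), as described in this thread.
interface ShutdownableConsumer {
    void shutdown();
}

final class CleanExit {
    private CleanExit() {}

    /**
     * Registers a JVM shutdown hook that calls consumer.shutdown().
     * The hook runs on normal exit, SIGTERM, or Ctrl-C. It does NOT
     * run on 'kill -9', which the JVM cannot intercept.
     */
    static Thread installShutdownHook(ShutdownableConsumer consumer) {
        Thread hook = new Thread(consumer::shutdown, "consumer-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

With the real client you would pass something like `() -> consumer.shutdown()`
right after creating the consumer. A 'kill -9' still bypasses the hook, so
duplicates remain possible in that case.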
>
>
> b.      For client crash, what can client do to avoid duplicate messages
> when restarted? What I can think of is to read last message from log file
> and ignore the first few received duplicate messages until receiving the
> last read message. But is it possible for client to read log file directly?
>

If you can't tolerate the possibility of duplicates, you need to look at the
Simple Consumer example. There you control the offset storage.
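
Once you store offsets yourself, skipping replayed messages comes down to
comparing each message's offset with the last one you processed for that
partition. Here is a rough sketch of that bookkeeping. OffsetDeduplicator is a
hypothetical helper, not part of Kafka, and it assumes offsets only increase
within a partition and that you persist the last processed offset somewhere of
your choosing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Kafka class): remembers the highest offset
// already processed per partition and rejects anything at or below it.
final class OffsetDeduplicator {
    private final Map<Integer, Long> lastProcessed = new HashMap<>();

    /** Restores the last processed offset loaded from your own store. */
    void restore(int partition, long offset) {
        lastProcessed.put(partition, offset);
    }

    /**
     * Returns true and records the offset if the message is new;
     * returns false if it is a replay of an already processed message.
     */
    boolean markIfNew(int partition, long offset) {
        Long last = lastProcessed.get(partition);
        if (last != null && offset <= last) {
            return false; // duplicate delivered after a restart or leader change
        }
        lastProcessed.put(partition, offset);
        return true;
    }
}
```

The in-memory map alone does not survive a crash. To really avoid duplicates
you would save the offset together with the result of processing the message
(for example in the same database transaction) and call restore() on startup.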
>
>
> c.       For the change of the partition leader, is there anything that
> clients can do to avoid duplicates?
>
> Thanks.
>
>
>
> Libo
>
>