Re: a few questions from high level consumer documentation.
On Thu, May 9, 2013 at 12:36 AM, Rob Withers <[EMAIL PROTECTED]> wrote:

> > -----Original Message-----
> > From: Chris Curtin [mailto:[EMAIL PROTECTED]]
> > > 1 When you say the iterator may block, do you mean hasNext() may block?
> > >
> >
> > Yes.
> Is this due to a potential non-blocking fetch (broker/zookeeper returns an
> empty block if offset is current)?  Yet this blocks the network call of the
> consumer iterator, do I have that right?  Are there other reasons it could
> block?  Like the call fails and a backup call is made?

I'll let the Kafka team answer this; I don't know the low-level details.

> > > b. For client crash, what can client do to avoid duplicate messages
> > > when restarted? What I can think of is to read last message from log
> > > file and ignore the first few received duplicate messages until
> > > receiving the last read message. But is it possible for client to read
> > > log file directly?
> > >
> >
> > If you can't tolerate the possibility of duplicates you need to look at
> > the Simple Consumer example. There you control the offset storage.
> Do you have example code that manages only once, even when a consumer for a
> given partition goes away?

No, but look at the point in the Simple Consumer example where the read
occurs (and the write to System.out): at that point you know the offset you
just read, so you need to put it somewhere. With the Simple Consumer, Kafka
leaves all the offset management to you.
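
A minimal sketch of what "put it somewhere" could look like. It does not use the Kafka client API: the partition is stood in for by an in-memory list, and FileOffsetStore, CheckpointingReader, and the file name are hypothetical names chosen for illustration. The store persists the next offset to read, so a restarted client resumes from there instead of from the beginning.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo: none of these classes are part of Kafka.
class OffsetCheckpointDemo {
    public static void main(String[] args) {
        // Stand-in for one partition's messages; the list index plays the role of the offset.
        List<String> log = Arrays.asList("m0", "m1", "m2", "m3", "m4");
        FileOffsetStore store = FileOffsetStore.temporary();

        // First run handles two messages, then stops (simulating a crash).
        System.out.println(CheckpointingReader.consume(log, store, 2)); // [m0, m1]
        // The "restarted" run resumes from the saved offset.
        System.out.println(CheckpointingReader.consume(log, store, 10)); // [m2, m3, m4]
    }
}

// Persists the next offset to read as text in a single file.
class FileOffsetStore {
    private final Path file;

    FileOffsetStore(Path file) { this.file = file; }

    // Convenience factory for the demo: a store backed by a fresh temp directory.
    static FileOffsetStore temporary() {
        try {
            Path dir = Files.createTempDirectory("offsets");
            return new FileOffsetStore(dir.resolve("partition-0.offset"));
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // Returns the saved offset, or 0 when nothing has been saved yet.
    long load() {
        try {
            if (!Files.exists(file)) return 0L;
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return Long.parseLong(text.trim());
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // Writes to a sibling temp file first, then moves it over the old checkpoint,
    // so the checkpoint is only replaced once the new value is fully written.
    void save(long nextOffset) {
        try {
            Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
            Files.write(tmp, Long.toString(nextOffset).getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }
}

class CheckpointingReader {
    // Handles up to max messages starting at the saved offset; returns what was handled.
    static List<String> consume(List<String> log, FileOffsetStore store, int max) {
        List<String> handled = new ArrayList<>();
        long offset = store.load();
        while (offset < log.size() && handled.size() < max) {
            handled.add(log.get((int) offset)); // "process" the message
            offset++;
            store.save(offset); // checkpoint only after processing
        }
        return handled;
    }
}
```

Because the checkpoint is saved after the message is handled, a crash between those two steps would still replay one message on restart. Avoiding that entirely would mean storing the offset atomically together with whatever the processing produces.
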
> What does happen with rebalancing when a consumer goes away?
Hmm, I can't find the link to the algorithm right now. Jun or Neha, can you?
> Is this
> behavior of the high-level consumer group?
> Is there a way to supply one's
> own simple consumer with only once, within a consumer group that
> rebalances?
No. Simple Consumers don't have rebalancing steps. Basically, you take
control of what is requested from which topics and partitions, so you could
ask for a specific offset in a topic/partition 100 times in a row and Kafka
will happily return it to you. Nothing is written to ZooKeeper either; you
control everything.
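
To illustrate the "same offset 100 times in a row" point, here is a toy model rather than Kafka code. InMemoryPartition and its fetch(offset, max) method are hypothetical stand-ins, and the only thing the sketch assumes is what is described above: the log keeps no per-consumer position, so the caller decides which offset to ask for on every request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a toy log, not the Kafka SimpleConsumer API.
class StatelessFetchDemo {
    public static void main(String[] args) {
        InMemoryPartition partition = new InMemoryPartition();
        partition.append("a");
        partition.append("b");
        partition.append("c");

        // Asking for offset 1 repeatedly returns the same messages every time.
        for (int i = 0; i < 3; i++) {
            System.out.println(partition.fetch(1, 2)); // [b, c]
        }
        // Asking at the end of the log returns nothing until something is appended.
        System.out.println(partition.fetch(3, 2)); // []
    }
}

class InMemoryPartition {
    private final List<String> messages = new ArrayList<>();

    // Appends a message and returns the offset it was stored at.
    long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // Returns up to max messages starting at offset. No reader position is
    // remembered here, so repeated calls with the same offset give the same result.
    List<String> fetch(long offset, int max) {
        if (offset < 0 || offset > messages.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        int from = (int) offset;
        int to = Math.min(messages.size(), from + max);
        return new ArrayList<>(messages.subList(from, to));
    }
}
```
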

> What happens if a producer goes away?

Shouldn't matter to the consumers. The Brokers are what the consumers talk
to, so if nothing is writing, the Broker won't have anything to send.

> thanks much,
> rob