Re: a few questions from high level consumer documentation.

On 5/9/13 8:27 AM, Chris Curtin wrote:
> On Thu, May 9, 2013 at 12:36 AM, Rob Withers <[EMAIL PROTECTED]> wrote:
>
>>
>>> -----Original Message-----
>>> From: Chris Curtin [mailto:[EMAIL PROTECTED]]
>>>> 1 When you say the iterator may block, do you mean hasNext() may block?
>>>>
>>> Yes.
>> Is this due to a potential non-blocking fetch (broker/zookeeper returns an
>> empty block if offset is current)?  Yet this blocks the network call of the
>> consumer iterator, do I have that right?  Are there other reasons it could
>> block?  Like the call fails and a backup call is made?
>>
> I'll let the Kafka team answer this. I don't know the low level details.
The iterator will block if there is no more data to consume. The
iterator is actually reading messages from a BlockingQueue which is fed
messages by the fetcher threads. The reason for this design is to let you
configure blocking with or without a timeout in the ConsumerIterator;
this is controlled by the consumer timeout property (consumer.timeout.ms).
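
For example, here is a minimal sketch of how that timeout surfaces in client
code. Property names follow the 0.8 high-level consumer; the ZooKeeper
address, group id and topic name are placeholders:

  Properties props = new Properties();
  props.put("zookeeper.connect", "localhost:2181");   // placeholder
  props.put("group.id", "example-group");             // placeholder
  props.put("consumer.timeout.ms", "5000");           // hasNext() gives up after 5s of no data

  ConsumerConnector connector =
      Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
  KafkaStream<byte[], byte[]> stream = connector
      .createMessageStreams(Collections.singletonMap("my-topic", 1))
      .get("my-topic").get(0);

  ConsumerIterator<byte[], byte[]> it = stream.iterator();
  try {
      while (it.hasNext()) {                           // blocks until a message or the timeout
          MessageAndMetadata<byte[], byte[]> m = it.next();
          // process m.message() ...
      }
  } catch (ConsumerTimeoutException e) {
      // no message arrived within consumer.timeout.ms
  }

Without consumer.timeout.ms set, hasNext() simply blocks until the fetcher
threads put something on the queue.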
>
>
>>>> b. For client crash, what can the client do to avoid duplicate messages
>>>> when restarted? What I can think of is to read the last message from the
>>>> log file and ignore the first few received duplicate messages until
>>>> receiving the last read message. But is it possible for the client to
>>>> read the log file directly?
>>> If you can't tolerate the possibility of duplicates you need to look at
>>> the Simple Consumer example. There you control the offset storage.
>> Do you have example code that manages only-once delivery, even when a
>> consumer for a given partition goes away?
>>
> No, but if you look at the Simple Consumer example where the read occurs
> (and the write to System.out), at that point you know the offset you just
> read, so you need to store it somewhere. With the Simple Consumer, Kafka
> leaves all the offset management to you.
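To make that bookkeeping concrete, a rough sketch of the pattern (the helpers
readStoredOffset, fetchFrom, process and storeOffset are hypothetical, standing
in for your own durable store, a Simple Consumer fetch and your handler):

  // Record the offset of the *next* message after each one is handled, ideally
  // in the same transaction as the processing result; on restart, resume from
  // the stored offset and nothing gets reprocessed.
  long startOffset = readStoredOffset();                // hypothetical helper
  for (MessageAndOffset mo : fetchFrom(startOffset)) {  // hypothetical Simple Consumer fetch
      process(mo.message());                            // hypothetical handler
      storeOffset(mo.nextOffset());                     // resume point after a crash
  }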
>
>
>> What happens with rebalancing when a consumer goes away?
>
> Hmm, I can't find the link to the algorithm right now. Jun or Neha can you?
Down at the bottom of the 0.7 design page
http://kafka.apache.org/07/design.html
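Roughly, the assignment step is a deterministic range split: every consumer in
the group sorts the partitions and the consumer ids the same way and computes
its own slice, so no coordination beyond the ZooKeeper watches is needed. An
illustrative sketch of that idea (not the actual Kafka code):

  // Range-style assignment for one topic: consumer i takes the i-th chunk of
  // ceil(P / C) partitions from the sorted partition list.
  static List<Integer> partitionsFor(String consumerId,
                                     List<String> consumerIds,
                                     List<Integer> partitions) {
      Collections.sort(consumerIds);
      Collections.sort(partitions);
      int i = consumerIds.indexOf(consumerId);
      int n = (int) Math.ceil((double) partitions.size() / consumerIds.size());
      int from = Math.min(i * n, partitions.size());
      int to = Math.min(from + n, partitions.size());
      return partitions.subList(from, to);
  }

When a consumer joins or leaves, the remaining consumers rerun this assignment
over the new membership list, which is essentially what the rebalance does.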
>
>
>> Is this
>> behavior of the high-level consumer group?
>
> Yes.
>
>
>> Is there a way to supply one's own simple consumer with only-once
>> semantics, within a consumer group that rebalances?
>>
> No. Simple Consumers don't have rebalancing steps. Basically you take
> control of what is requested from which topics and partitions. So you could
> ask for a specific offset in a topic/partition 100 times in a row and Kafka
> will happily return it to you. Nothing is written to ZooKeeper either; you
> control everything.
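As a sketch of what "you control everything" looks like (class names are from
the 0.8 Simple Consumer example; host, topic, partition and offset values are
placeholders):

  // You name the broker, topic, partition and starting offset on every fetch;
  // nothing is recorded in ZooKeeper on your behalf.
  SimpleConsumer consumer =
      new SimpleConsumer("broker-host", 9092, 100000, 64 * 1024, "example-client");

  FetchRequest req = new FetchRequestBuilder()
      .clientId("example-client")
      .addFetch("my-topic", 0, 1234L, 100000)   // partition 0, starting at offset 1234
      .build();

  FetchResponse resp = consumer.fetch(req);
  ByteBufferMessageSet messages = resp.messageSet("my-topic", 0);
  // Issuing the same request later simply returns the same messages again.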
>
>
>
>> What happens if a producer goes away?
>>
> Shouldn't matter to the consumers. The Brokers are what the consumers talk
> to, so if nothing is writing, the Broker won't have anything to send.
>
>> thanks much,
>> rob
>>
>>
>>