Re: A problem of fault-tolerant high-level consumer group
If this is reproducible and you have logs, those would help. In short,
though: yes, if you start the replacement instance before ZooKeeper has
actually expired the old consumer instance's session, you could run into
rebalance exceptions (in which case you should see conflicts in your
consumer logs).
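
One way to avoid the race is to have the supervisor wait until the old
consumer's ephemeral registration has disappeared from ZooKeeper before it
launches the replacement. The following is only a minimal sketch of that
idea, not something from the Kafka code base: `wait_until_gone` and the
`node_exists` callback are hypothetical names, and the callback is assumed
to check the old consumer's registration node (typically under
/consumers/<group>/ids/ for the ZooKeeper-based high-level consumer).

```python
import time
from typing import Callable


def wait_until_gone(node_exists: Callable[[], bool],
                    timeout_s: float = 30.0,
                    poll_interval_s: float = 0.5,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until node_exists() returns False or the timeout elapses.

    node_exists is a hypothetical callback supplied by the caller; it is
    assumed to ask ZooKeeper whether the old consumer's ephemeral node is
    still present. Returns True once the node is gone, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if not node_exists():
            # The old session has expired and its registration was removed,
            # so it should be safe to start the replacement consumer.
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The supervisor would call this with a timeout somewhat larger than the
consumer's ZooKeeper session timeout and relaunch only when it returns True.
With a client library such as kazoo, the callback could be something like
`lambda: zk.exists(path) is not None`, where `zk` and `path` are again
placeholders for your own client and node path.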


On Wed, Nov 13, 2013 at 12:12:20PM -0800, [EMAIL PROTECTED] wrote:
> I'm working on a fault-tolerant consumer group. The idea is to maximize
> Kafka throughput: I request the metadata from the broker, create
> #{num of partitions} consumers for each topic, and distribute them across
> different nodes. There is also a mechanism to detect the failure of any
> node and restart it.
> The problem is that if I kill one of the consumer processes, my program
> detects it and relaunches a new consumer with the same group id and client
> id. But the new consumer hits an error (something like a ZooKeeper entry
> not existing; I didn't keep the log) and never starts.
> I think the root cause is this: ZooKeeper detects the failure of the old
> consumer process, but before it deletes that consumer's entry, the new
> consumer comes up and communicates with ZooKeeper. ZooKeeper then deletes
> the entry, and the new consumer is no longer recognized by ZooKeeper.
> The sequence is like this:
> old consumer dies -> ZooKeeper detects it -> new consumer (same group id,
> client id) comes up -> ZooKeeper deletes the consumer entry -> new consumer
> hits an error and is not recognized by ZooKeeper
> It's OK in that I won't lose any data, because that data will go to other
> consumers, but it's annoying because I want to keep the consumer group
> balanced after failover.
> Thanks,
> Siyuan