On SIGTERM and SIGKILL consumer ids in zookeeper do not always get cleared
Varun Vijayaraghavan 2012-11-14, 16:52
Hi!
I am running ZooKeeper-based consumers using the brod library (a Python library for Kafka). I stop and start the consumers using a simple shell script.
I have noticed that when I kill ~50 consumers on a topic and immediately start them again, some of the consumer ids in ZooKeeper do not get cleared. This messes up rebalancing, and partitions get assigned to zombie consumers.
Right now, I have taken the hacky approach of waiting ~15 seconds after killing the existing consumers before starting them again.
Have you seen this issue, and have you implemented anything to handle this in the JVM library?
Additional info:
I am using kafka 0.7.2 and zookeeper 3.4.4
--
- varun :)
Re: On SIGTERM and SIGKILL consumer ids in zookeeper do not always get cleared
Neha Narkhede 2012-11-14, 17:42
Hi,
Since you kill the consumers, it takes up to roughly zookeeper.sessiontimeout.ms for ZooKeeper to detect that a consumer is gone. If you want to restart the consumers immediately, try shutting them down cleanly (kill -15, not kill -9).
Thanks, Neha
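For illustration only, here is a minimal Python sketch of what "shutting down cleanly" on kill -15 could look like. It is not brod's API: FakeZkSession, run_consumer and poll_once are hypothetical names, and the fake session only stands in for whatever ZooKeeper client the consumer really uses. The idea is that the SIGTERM handler merely sets a flag, and the session is closed on the way out of the loop so its ephemeral nodes can be removed without waiting for the session timeout. Note that kill -9 cannot be caught, so no handler helps there.

```python
import signal
import threading


class FakeZkSession:
    """Hypothetical stand-in for a real ZooKeeper client session."""

    def __init__(self):
        self.closed = False

    def close(self):
        # A real client would close its ZooKeeper session here, which lets
        # the server delete that session's ephemeral nodes right away.
        self.closed = True


def run_consumer(session, poll_once):
    """Call poll_once() until SIGTERM arrives, then close the session.

    Must be called from the main thread, because Python only allows
    signal handlers to be installed there.
    """
    stop_requested = threading.Event()

    def handle_sigterm(signum, frame):
        # Keep the handler tiny: just ask the loop to stop.
        stop_requested.set()

    previous = signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        while not stop_requested.is_set():
            poll_once()
    finally:
        # Runs on SIGTERM and on unexpected exceptions alike.
        session.close()
        signal.signal(signal.SIGTERM, previous)
```

In a real consumer, poll_once would fetch and process one batch of messages, and it should return often enough for the stop flag to be noticed quickly.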
Re: On SIGTERM and SIGKILL consumer ids in zookeeper do not always get cleared
David Arthur 2012-11-19, 19:34
BTW, this is a feature of the ephemeral nodes in ZooKeeper, not a feature of Kafka. If Kafka doesn't close the ZK connection explicitly, ZK waits for the session timeout before deleting ephemeral nodes.
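When the old processes cannot be shut down cleanly, one alternative to a fixed sleep is to poll until the stale ids are actually gone before restarting. The sketch below is only an assumption about how that could be wired up: wait_for_ids_cleared is a hypothetical helper, and list_consumer_ids is a caller-supplied function that returns the ids currently registered for the consumer group (for example, by listing the children of the group's ids node with whatever ZooKeeper client is in use).

```python
import time


def wait_for_ids_cleared(list_consumer_ids, stale_ids,
                         timeout_s=30.0, interval_s=0.5):
    """Poll until none of stale_ids is registered, or until timeout_s passes.

    list_consumer_ids is a hypothetical callable returning the currently
    registered consumer ids. Returns True if all stale ids disappeared,
    False if the timeout expired first.
    """
    stale = set(stale_ids)
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = stale & set(list_consumer_ids())
        if not remaining:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A restart script could record the ids of the consumers it is about to kill, kill them, and start the new consumers only once this returns True. The timeout should be at least as long as the ZooKeeper session timeout.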