Kafka, mail # user - ConsumerRebalanceFailedException when broker unavailable


Re: ConsumerRebalanceFailedException
Joel Koshy 2013-07-16, 07:05
Yes - a rebalance is the consumers in a group coordinating through
ZooKeeper to decide which consumer owns which partition.
A rebalance is triggered when either of the following happens:
- a consumed topic partition appears or disappears, i.e., a broker
comes or goes
- a consumer instance in the group comes or goes
"Goes" can also be triggered by a session expiration in ZooKeeper,
typically caused by client-side GC pauses or flaky connections to ZooKeeper.
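
To make the two triggers concrete, here is a minimal sketch of a range-style
assignment, assuming partitions are named "brokerId-partitionId" and that
every consumer computes the same split from the sorted lists. The class name
RebalanceSketch and the method assign are hypothetical and this is a
simplified model, not Kafka's actual code. It only shows why a change on
either side forces ownership to be recomputed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of a range-style partition assignment.
public class RebalanceSketch {

    // Split the sorted partitions into contiguous ranges, one per sorted consumer.
    // The first (partitions % consumers) consumers each get one extra partition.
    static Map<String, List<String>> assign(List<String> consumers, List<String> partitions) {
        List<String> cs = new ArrayList<>(consumers);
        List<String> ps = new ArrayList<>(partitions);
        Collections.sort(cs);
        Collections.sort(ps);

        Map<String, List<String>> owned = new TreeMap<>();
        if (cs.isEmpty()) {
            return owned;
        }
        int perConsumer = ps.size() / cs.size();
        int extra = ps.size() % cs.size();
        for (int i = 0; i < cs.size(); i++) {
            int start = perConsumer * i + Math.min(i, extra);
            int count = perConsumer + (i < extra ? 1 : 0);
            owned.put(cs.get(i), new ArrayList<>(ps.subList(start, start + count)));
        }
        return owned;
    }

    public static void main(String[] args) {
        // Assumed layout: 3 brokers with 2 partitions each, 2 consumers in the group.
        List<String> allPartitions = List.of("0-0", "0-1", "1-0", "1-1", "2-0", "2-1");
        List<String> group = List.of("c1", "c2");
        System.out.println("steady state:     " + assign(group, allPartitions));

        // Trigger 1: broker 2 goes away, so its partitions disappear.
        List<String> withoutBroker2 = List.of("0-0", "0-1", "1-0", "1-1");
        System.out.println("broker 2 gone:    " + assign(group, withoutBroker2));

        // Trigger 2: consumer c2 goes away (shutdown or ZooKeeper session expiry).
        System.out.println("consumer c2 gone: " + assign(List.of("c1"), allPartitions));
    }
}
```

In each scenario the surviving consumers end up with a different set of
partitions than before, which is why every member of the group has to take
part in the rebalance.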

On Mon, Jul 15, 2013 at 10:15 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> We have a small Kafka cluster (0.7.1, 3 nodes) in EC2. The load is about
> 200 million events per day, each a few kilobytes. We have a single-node
> ZooKeeper.
>
> Yesterday our Kafka clients suddenly started throwing the following
> exception:
> java.lang.RuntimeException: kafka.common.ConsumerRebalanceFailedException: CONSUMER_GROUP_NAME_ip-00-00-00-00.ec2.internal-1373821190828-5f78e9af can't rebalance after 4 retries
>     at com.gumgum.kafka.consumer.KafkaTemplate.executeWithBatch(KafkaTemplate.java:59)
>     at com.gumgum.storm.fileupload.GenericKafkaSpout.nextTuple(GenericKafkaSpout.java:73)
>     at backtype.storm.daemon.executor$fn__3968$fn__4009$fn__4010.invoke(executor.clj:433)
>     at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
>
> None of the Kafka clients (ConsumerConnector class) would start. They would
> fail with the exception above.
>
> We tried restarting the clients and restarting ZooKeeper as well, but
> it only started working again when we restarted all of our Kafka brokers.
> We didn't lose any data because the producers (going directly to the brokers
> through a load balancer) were working fine.
>
> I tried googling this issue and it looks like a lot of people have faced it,
> but I couldn't find anything concrete.
>
> Given this, I have two questions:
>
> It would be nice if you could tell me why this can happen or point me to a
> link where I can understand it better. What does consumer rebalancing mean?
> Does it mean consumers are trying to coordinate amongst themselves using
> ZooKeeper?
>
> On a separate note, are there any JMX metrics I should be monitoring to
> make sure that my Kafka cluster is healthy? How can I keep watch on my
> Kafka cluster?
>
> Regards,
> Vaibhav Puranik
> GumGum