Kafka >> mail # user >> Re: ConsumerRebalanceFailedException


Re: ConsumerRebalanceFailedException
rebalance.backoff.ms

Thanks,

Jun
On Mon, Dec 2, 2013 at 11:31 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:

> Thanks for your insights, Jun. That is really helpful. I forgot to mention
> the cause of the issue in my previous email. We have three brokers. I
> noticed from the log that all three brokers re-registered themselves with
> zk. That means all of them were somehow offline for a short time and then
> automatically came back online. That caused the rebalance failure. While
> all the brokers are offline, I assume a consumer will constantly retry to
> establish a connection. How long is the interval between the retries?
> Is it max.fetch.wait + socket.timeout.ms?
> Thanks.
>
> Libo
>
>
> -----Original Message-----
> From: Jun Rao [mailto:[EMAIL PROTECTED]]
> Sent: Monday, December 02, 2013 11:55 AM
> To: [EMAIL PROTECTED]
> Subject: Re: ConsumerRebalanceFailedException
>
> Is the failure on the last rebalance? If so, some partitions will not have
> any consumers. A common reason for rebalance failure is that there is
> conflict in owning partitions among different consumers in the same group.
> Increasing the # of retries and the amount of backoff time between retries should
> help. Our default setting should be good enough if there are not too many
> topics being subscribed and the ZK latency is normal.
>
> Thanks,
>
> Jun
>
>
> On Mon, Dec 2, 2013 at 6:57 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
>
> > Actually, I saw this line in the log : can't rebalance after 4 retries.
> > What should I expect in this case? Do all consumer threads fail, or
> > only some of them?
> > If I increase the number of retries or delay between retries, will
> > that help?
> >
> > Regards,
> >
> > Libo
> >
> >
> > -----Original Message-----
> > From: Jun Rao [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, November 29, 2013 8:50 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: ConsumerRebalanceFailedException
> >
> > Transient rebalance failures are ok. However, it's important that the
> > last rebalance in a sequence succeeds. Otherwise, some of the
> > partitions will not be consumed by any consumers.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Fri, Nov 29, 2013 at 10:44 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> >
> > > You are right, Joe. I checked our brokers' log. We have three brokers.
> > > All of them failed to connect to zk at some point.
> > > So they were offline and later reregistered themselves with the zk.
> > > I don't know how many rebalances should be triggered in that case.
> > > There is only one exception found in the consumer's log. My question is
> > > whether users need to do anything to handle
> > > ConsumerRebalanceFailedException.
> > >
> > > This is from consumer log:
> > >
> > > [28/11/13 16:38:56:056 PM EST] 102 ERROR consumer.ZookeeperConsumerConnector: [xxxxxxxxxx ], error during syncedRebalance
> > > kafka.common.ConsumerRebalanceFailedException: xxxxxxxxx can't rebalance after 4 retries
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:397)
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326)
> > >
> > > Regards,
> > >
> > > Libo
> > >
> > >
> > > -----Original Message-----
> > > From: Joe Stein [mailto:[EMAIL PROTECTED]]
> > > Sent: Friday, November 29, 2013 11:57 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: ConsumerRebalanceFailedException
> > >
> > > What is the full stack trace? If you see "can't rebalance after 4
> > > retries" then likely the problem is the broker is down or not available
> > >
> > > /*******************************************
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > ********************************************/
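
For readers who want to try the advice above (more retries and a longer backoff between them), here is a minimal sketch of how those two settings might be passed to the ZooKeeper-based high-level consumer of that era. The property names rebalance.max.retries and rebalance.backoff.ms are the consumer settings discussed in the thread; the class name, the ZooKeeper hosts, the group id, and the chosen values are hypothetical and only for illustration. The helper that multiplies the two values gives a rough estimate of how long the consumer keeps backing off before giving up, and it ignores the time each rebalance attempt itself takes.

```java
import java.util.Properties;

public class RebalanceConfigSketch {

    // Builds consumer properties with explicit rebalance retry settings.
    // Host names, group id, and values are illustrative assumptions.
    public static Properties consumerProps(String zkConnect, String groupId,
                                           int maxRetries, int backoffMs) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        // Number of rebalance attempts before ConsumerRebalanceFailedException.
        props.setProperty("rebalance.max.retries", Integer.toString(maxRetries));
        // Wait between two rebalance attempts, in milliseconds.
        props.setProperty("rebalance.backoff.ms", Integer.toString(backoffMs));
        return props;
    }

    // Rough estimate of the total backoff time before the consumer gives up.
    // Ignores the duration of the rebalance attempts themselves.
    public static long totalBackoffMs(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        return retries * backoff;
    }

    public static void main(String[] args) {
        Properties props = consumerProps(
                "zk1:2181,zk2:2181,zk3:2181", "example-group", 10, 5000);
        System.out.println("Approximate backoff window (ms): " + totalBackoffMs(props));
    }
}
```

The resulting Properties object would then be handed to the consumer configuration in the usual way. If the brokers or ZooKeeper can be unreachable for longer than the estimated window, the last rebalance may still fail, which is the case Jun warns about.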

 