Kafka, mail # user - Re: ConsumerRebalanceFailedException


Re: ConsumerRebalanceFailedException
Jun Rao 2013-12-03, 05:45
That interval is set by rebalance.backoff.ms.

Thanks,

Jun
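
The tuning advice in this thread (more retries, longer backoff between them) can be sketched as a consumer configuration. This is a minimal illustration, not a recommended setup: `rebalance.max.retries`, `rebalance.backoff.ms`, `zookeeper.connect` and `group.id` are the Kafka 0.8 high-level consumer property names as I understand them, while the class name, helper methods, host names, group name and the values 10 and 5000 are hypothetical. The window estimate assumes the consumer sleeps once per failed attempt, so treat it as a rough figure only.

```java
import java.util.Properties;

public class RebalanceConfigSketch {

    // Builds properties for the ZooKeeper-based high-level consumer.
    // All values passed in are illustrative, not recommendations.
    public static Properties consumerProps(String zkConnect, String groupId,
                                           int maxRetries, int backoffMs) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        // Number of rebalance attempts before ConsumerRebalanceFailedException.
        props.setProperty("rebalance.max.retries", Integer.toString(maxRetries));
        // Pause between consecutive rebalance attempts, in milliseconds.
        props.setProperty("rebalance.backoff.ms", Integer.toString(backoffMs));
        return props;
    }

    // Rough estimate of the time spent backing off before the consumer gives up,
    // assuming one backoff per failed attempt (an assumption, not from the thread).
    public static long approxBackoffWindowMs(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        return retries * backoff;
    }

    public static void main(String[] args) {
        // Hypothetical ZooKeeper hosts and consumer group.
        Properties props = consumerProps(
                "zk1:2181,zk2:2181,zk3:2181", "example-group", 10, 5000);
        System.out.println("approx backoff window (ms): "
                + approxBackoffWindowMs(props));
    }
}
```

The point of the estimate is that the backoff window should comfortably exceed the time the brokers need to re-register with ZooKeeper; otherwise the last rebalance in the sequence can still fail.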
On Mon, Dec 2, 2013 at 11:31 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:

> Thanks for your insights, Jun. That is really helpful. I forgot to mention
> the cause of the issue in my previous
> Email. We have three brokers. I notice from the log that all three brokers
> re-registered themselves with zk.
> That means all of them were somehow offline for a short time and then
> automatically got online again. That
> caused the rebalance failure. While all the brokers are offline, I assume
> a consumer will constantly retry to
> establish connection again. How long is the interval between the retries?
> Is it max.fetch.wait + socket.timeout.ms?
> Thanks.
>
> Libo
>
>
> -----Original Message-----
> From: Jun Rao [mailto:[EMAIL PROTECTED]]
> Sent: Monday, December 02, 2013 11:55 AM
> To: [EMAIL PROTECTED]
> Subject: Re: ConsumerRebalanceFailedException
>
> Is the failure on the last rebalance? If so, some partitions will not have
> any consumers. A common reason for rebalance failure is that there is
> conflict in owning partitions among different consumers in the same group.
> Increasing the number of retries and the backoff time between retries should
> help. Our default setting should be good enough if there are not too many
> topics being subscribed and the ZK latency is normal.
>
> Thanks,
>
> Jun
>
>
> On Mon, Dec 2, 2013 at 6:57 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
>
> > Actually, I saw this line in the log: can't rebalance after 4 retries.
> > What should I expect in this case? Do all consumer threads fail, or
> > only some of them?
> > If I increase the number of retries or delay between retries, will
> > that help?
> >
> > Regards,
> >
> > Libo
> >
> >
> > -----Original Message-----
> > From: Jun Rao [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, November 29, 2013 8:50 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: ConsumerRebalanceFailedException
> >
> > Transient rebalance failures are ok. However, it's important that the
> > last rebalance in a sequence succeeds. Otherwise, some of the
> > partitions will not be consumed by any consumers.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Fri, Nov 29, 2013 at 10:44 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> >
> > > You are right, Joe. I checked our brokers' log. We have three brokers.
> > > All of them failed to connect to zk at some point.
> > > So they were offline and later reregistered themselves with the zk.
> > > I don't know how many rebalances should be triggered in that case.
> > > There is only one exception in the consumer's log. My question is
> > > whether users need to do anything to handle
> > > ConsumerRebalanceFailedException.
> > >
> > > This is from consumer log:
> > >
> > > [28/11/13 16:38:56:056 PM EST] 102 ERROR consumer.ZookeeperConsumerConnector: [xxxxxxxxxx ], error during syncedRebalance
> > > kafka.common.ConsumerRebalanceFailedException: xxxxxxxxx can't rebalance after 4 retries
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:397)
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326)
> > >
> > > Regards,
> > >
> > > Libo
> > >
> > >
> > > -----Original Message-----
> > > From: Joe Stein [mailto:[EMAIL PROTECTED]]
> > > Sent: Friday, November 29, 2013 11:57 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: ConsumerRebalanceFailedException
> > >
> > > What is the full stack trace? If you see "can't rebalance after 4
> > > retries", then the problem is likely that the broker is down or not
> > > available.
> > >
> > > /*******************************************
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > ********************************************/