Kafka >> mail # user >> Kafka rebalancing causes Zookeeper to fail


Re: Kafka rebalancing causes Zookeeper to fail
You can find some of the GC settings in
https://cwiki.apache.org/confluence/display/KAFKA/Operations

There were some ZK bugs exposed during session expiration, which were fixed
in 3.3.4. Not sure if 3.4.5 exposes any new issues. The easiest thing is
probably to avoid GC-induced ZK session timeout in the first place or use a
larger session timeout.
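
As a rough illustration of the "larger session timeout" option, here is a minimal Java sketch that builds consumer properties and sanity-checks them. The property names (`zookeeper.session.timeout.ms`, `rebalance.backoff.ms`, `rebalance.max.retries`) are the 0.8 high-level consumer settings; the class name, helper names, and the numeric values are illustrative assumptions, not recommendations. The check encodes the rule of thumb, as I understand it from the rebalance FAQ entry referenced in this thread, that the total rebalance retry window should exceed the ZK session timeout.

```java
import java.util.Properties;

// Hypothetical helper class; not part of Kafka.
public class ConsumerTimeouts {

    // Builds consumer properties with explicit ZK session and rebalance settings.
    public static Properties consumerProps(int sessionTimeoutMs, int backoffMs, int maxRetries) {
        Properties props = new Properties();
        props.setProperty("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.setProperty("rebalance.backoff.ms", Integer.toString(backoffMs));
        props.setProperty("rebalance.max.retries", Integer.toString(maxRetries));
        return props;
    }

    // Returns true when backoff * retries is longer than the session timeout,
    // so a consumer keeps retrying until a stale ephemeral node can expire.
    public static boolean retryWindowCoversSession(Properties props) {
        long session = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        return backoff * retries > session;
    }

    public static void main(String[] args) {
        // Assumed example values: 30 s session timeout, 8 retries spaced 5 s apart.
        Properties props = consumerProps(30000, 5000, 8);
        System.out.println("retry window covers session: " + retryWindowCoversSession(props));
    }
}
```

The resulting `Properties` object would be passed to the consumer configuration in the usual way; the check only guards against a retry window that is shorter than the session timeout.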

Thanks,

Jun
On Wed, Jan 22, 2014 at 8:29 AM, Ahmed H. <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I looked at that, not sure if it is applicable or not at this point. We
> used to have frequent rebalances, but that issue was mitigated by
> increasing the zktimeout on the consumer side. With that said, it may still
> be a problem. I haven't collected any metrics concerning rebalances in a
> while. I will certainly take a look at our current GC settings. What are
> typical settings that we should have for GC (I am not sure of what exactly
> I'm looking for)?
>
> As for downgrading the Zookeeper version, would there be any major loss of
> functionality? Version 3.4.5 is currently stable, so I am unsure of how it
> would help. I can try it and let it soak for a while to see if it helps or
> not. The problem is we have many components that tie into Zookeeper and I'm
> worried that downgrading may break some of our API calls to it.
>
> Is there a good way of trying to narrow this problem down further?
>
> Thanks again
>
>
> On Wed, Jan 22, 2014 at 10:15 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > Not sure how stable ZK 3.4.5 is. Could you try 3.3.4? Also, see if
> > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog?
> > is applicable.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Jan 22, 2014 at 6:24 AM, Ahmed H. <[EMAIL PROTECTED]> wrote:
> >
> > > I have a basic Zookeeper/Kafka setup. I am still on Kafka 0.8 beta 1, and
> > > Zookeeper 3.4.5. The activity on this machine isn't massive...I would say
> > > the Kafka queues get a consistent 1 message every 2-3 seconds, as well as
> > > occasional spikes, but still nothing large enough to push the limits. Both
> > > Kafka and Zookeeper are running on the same machine.
> > >
> > > Occasionally, a rebalance is triggered, which causes our Kafka clients to
> > > try reconnecting several times, but it ultimately fails with the following
> > > error:
> > >
> > >
> > > 04:56:10,020 INFO  [kafka.consumer.ZookeeperConsumerConnector] (alarms.topology.updates_<host>-1383643783747-c7775701_watcher_executor) [alarms.topology.updates_<host>-1383643783747-c7775701], exception during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
> > >         at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
> > >         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
> > >         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
> > >         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
> > >         at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > >         at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
> > >         at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374)

 