Re: Zookeeper session losing some watchers
Jamie,

We do use chroot. However, the chroot problem will lose all watchers, not
some watchers, right?

Thanks,

Jun

On Wed, Nov 2, 2011 at 7:34 PM, Jamie Rothfeder <[EMAIL PROTECTED]> wrote:

> Hi Neha,
>
> I encountered a similar problem with zookeeper losing watches and found
> that it was related to this bug:
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-961
>
> Are you using a chroot?
>
> Thanks,
> Jamie
>
> On Wed, Nov 2, 2011 at 1:16 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > We've been seeing a problem with our ZooKeeper servers lately, where
> > all of a sudden a session loses some of the watchers registered on
> > some of the znodes. Let me explain our Kafka-ZK setup. We have a Kafka
> > cluster in one DC that establishes sessions (with a 6-second timeout)
> > with a ZK cluster (of 4 machines) in another DC and registers watchers
> > on some ZooKeeper paths. Every couple of weeks, we observe a problem
> > with the Kafka servers, and on investigating further, we find that the
> > session lost some of the key watches, but not all.
> >
> > The last time this happened, we ran the wchc command on the ZK servers
> > and saw the problem. Unfortunately, we lost relevant information from
> > the ZK logs by the time we were ready to debug it further. Since this
> > causes Kafka servers to stop making progress, we want to set up some
> > kind of alert when this happens. This will help us collect more
> > information to give you. In particular, we were thinking about running
> > wchp periodically (maybe once a minute), grepping for the ZK paths and
> > counting the number of watches that should be registered for correct
> > operation. But I observed that the watcher info is not replicated
> > across all ZK servers, so we would have to query every ZK server in
> > order to get the full list.
> >
> > I'm not sure running wchp periodically on all ZK servers is the best
> > option for this alert. Can you think of what could be the problem here
> > and how we can set up this alert for now?
> >
> > Thanks
> > Neha
> >
>
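
For reference, the wchp-based check described in the original message could be sketched roughly as follows. This is only an illustration, not something anyone on the thread ran. It assumes wchp replies with unindented path lines, each followed by indented session-id lines. The host names, the znode paths and the expected counts are placeholders, and the helper names are made up for this sketch. Since watch information is kept per server, the script queries every server and merges the results before counting.

```python
import socket
from collections import defaultdict


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the whole reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_wchp(text):
    """Map each path to the set of session ids watching it.

    Assumed format: an unindented path line, then indented session-id lines.
    """
    watches = defaultdict(set)
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            if current is not None:
                watches[current].add(line.strip())
        else:
            current = line.strip()
    return watches


def merge_watches(per_server):
    """Union the per-server results, since each server only knows its own sessions."""
    merged = defaultdict(set)
    for watches in per_server:
        for path, sessions in watches.items():
            merged[path] |= sessions
    return merged


def missing_watches(merged, expected):
    """Return {path: observed_count} for paths with fewer watchers than expected."""
    result = {}
    for path, minimum in expected.items():
        observed = len(merged.get(path, ()))
        if observed < minimum:
            result[path] = observed
    return result


if __name__ == "__main__":
    # Placeholder servers, paths and counts; replace with the real ones.
    servers = [("zk1.example.com", 2181), ("zk2.example.com", 2181)]
    expected = {"/brokers/ids": 3, "/brokers/topics": 3}
    merged = merge_watches(parse_wchp(four_letter_word(h, p, "wchp")) for h, p in servers)
    for path, count in missing_watches(merged, expected).items():
        print(f"ALERT: {path} has {count} watchers, expected at least {expected[path]}")
```

One caveat: the ZooKeeper documentation warns that wchp and wchc can be expensive on servers holding many watches, so a once-a-minute poll should probably be tried against a single server first.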