Zookeeper, mail # user - determining zookeeper capacity requirements


Re: determining zookeeper capacity requirements
Ted Dunning 2012-12-06, 04:51
This sounds like a configuration problem somewhere.

Have you checked the usual suspects:

a) GC on client or ZK cluster?

b) bad configuration on ZK which allows split quorum? (Really, this is
surprisingly common.)

c) bad configuration on client for connect?

d) ZK swapping out due to inactivity during memory pressure?
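
A rough way to check (b) and the latency swings from the shell, assuming the
servers answer the `srvr` four-letter command on port 2181 and reusing the
zoo-ensemble hostnames from the original post. `count_leaders` is a
hypothetical helper name, not part of ZooKeeper; a healthy ensemble should
report exactly one leader.

```shell
# Sketch only: count_leaders is a made-up helper, not a ZooKeeper tool.
# Reads concatenated `srvr` output on stdin and prints how many servers
# report "Mode: leader". Anything other than 1 deserves a closer look.
count_leaders() {
  grep -c '^Mode: leader' || true
}

# Assumed usage (hostnames and port taken from the original post):
#   for i in 1 2 3; do echo srvr | nc zoo-ensemble$i 2181; done | count_leaders
#   for i in 1 2 3; do echo srvr | nc zoo-ensemble$i 2181 | grep -E '^(Mode|Latency)'; done
```

For (a) and (d), `jstat -gcutil <pid> 1000` against the ZK JVM and, on Linux,
`grep VmSwap /proc/<pid>/status` should give a quick read on GC pressure and
swapped-out memory.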

On Thu, Dec 6, 2012 at 1:50 AM, Ian Kallen <[EMAIL PROTECTED]> wrote:

> Thanks for replying. AFAIK, the change rate isn't high. There is a
> Storm cluster and a few other things whose internals I'm not familiar
> with, though, so they may be poking their znodes at a high rate that
> I'm not aware of. The missed watches are on applications that don't
> have rapid changes in any of their nodes. But we regularly see clients
> not fire data watches. Subsequent changes will fire them, so the
> clients seem to be connected and are just missing that first trigger.
> Latencies will also sometimes suffer pretty wide swings. So it had me
> wondering how to measure capacity utilization on the ensemble.
>
> On Wed, Dec 5, 2012 at 3:37 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> > This looks like very low load.
> >
> > What is the rate of change on znodes (i.e. what is the desired watch
> > signal rate)?
> >
> > On Wed, Dec 5, 2012 at 10:10 PM, Ian Kallen <[EMAIL PROTECTED]> wrote:
> >
> >> We have an ensemble of three servers and have observed varying
> >> latencies, watches that seemingly don't get fired on the client and
> >> other operational issues. Here are the current # connections/watches:
> >>
> >> shell$ for i in 1 2 3; do echo wchs | nc zoo-ensemble$i 2181; done
> >>
> >> 198 connections watching 174 paths
> >> Total watches:1914
> >> 41 connections watching 126 paths
> >> Total watches:1010
> >> 50 connections watching 143 paths
> >> Total watches:952
> >>
> >> I don't know if we should be concerned that the number of watches is
> >> in the thousands (or that zoo-ensemble1 is handling roughly the same
> >> number of watches as 2 & 3 combined). Should we be tuning the JVM
> >> in any particular way according to the number of watches? From a
> >> capacity planning standpoint, what metrics and guidelines should we be
> >> observing before we split our tree into separate ensembles or grow the
> >> current ensemble?
> >>
> >> thanks,
> >> -Ian
> >>
>
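
The `wchs` output quoted in the original post can be totalled per server with
a small filter, which makes the skew toward zoo-ensemble1 easier to track over
time. This is a minimal sketch: `summarize_wchs` is a hypothetical helper
name, and it assumes the two-line-per-server format shown in the thread.

```shell
# Sketch only: summarize_wchs is a made-up helper, not a ZooKeeper tool.
# Reads concatenated `wchs` output on stdin, prints one line per server
# (in the order received) followed by the total watch count.
summarize_wchs() {
  awk '
    /connections watching/ { conns = $1; paths = $4 }
    /^Total watches:/ {
      split($0, parts, ":")
      n++
      total += parts[2]
      printf "server %d: %d connections, %d paths, %d watches\n", n, conns, paths, parts[2]
    }
    END { printf "total watches: %d\n", total }
  '
}

# Assumed usage (same loop as in the original post):
#   for i in 1 2 3; do echo wchs | nc zoo-ensemble$i 2181; done | summarize_wchs
```

Fed the numbers from the thread, the last line reads `total watches: 3876`.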