Zookeeper, mail # dev - Discussion on supporting a large number of clients for a zk ensemble

RE: Discussion on supporting a large number of clients for a zk ensemble
Vishal Kathuria 2011-05-30, 18:15
Thanks for looking at this Camille and Benjamin,

There are 5 machines: 2 hosting clients and 3 hosting servers.
There is one client process on each of the client machines.
Each client process has 20 threads, each thread with 500 sessions.
So I have a total of 20K client sessions, which isn't that high really.
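Just to make the session math above explicit, a trivial check:

```java
public class SessionCount {
    public static void main(String[] args) {
        int clientMachines = 2;      // machines hosting clients
        int processesPerMachine = 1; // one client process each
        int threadsPerProcess = 20;  // threads per client process
        int sessionsPerThread = 500; // sessions per thread
        int total = clientMachines * processesPerMachine
                * threadsPerProcess * sessionsPerThread;
        System.out.println(total); // 20000
    }
}
```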

Each machine has two Intel® Xeon® L5420 processors (8 cores total).
The workload is fairly simple:
All the sessions do is keep a watch on a node. Once the watch fires, the client reads the contents of the node and sets the watch again.
There is one thread that periodically updates the node being watched (once every 30s, so very infrequently).
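As a rough sketch, each session's watch-and-reread loop looks something like this (the connect string and znode path here are hypothetical; the real setup runs 500 sessions per thread rather than one):

```java
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooKeeper;

public class WatchLoop implements Watcher {
    // Hypothetical values; the actual test uses a 3-server ensemble
    // and a 300s session timeout.
    private static final String CONNECT = "server1:2181,server2:2181,server3:2181";
    private static final String PATH = "/watched-node";
    private final ZooKeeper zk;

    public WatchLoop() throws Exception {
        zk = new ZooKeeper(CONNECT, 300_000, this);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                // Read the node and re-register the watch in one call;
                // ZooKeeper watches are one-shot, so this must be repeated
                // after every notification.
                byte[] data = zk.getData(PATH, this, null);
                // ... use the new data ...
            } catch (Exception e) {
                // A sketch only; real code would handle retries and
                // session expiry properly.
                e.printStackTrace();
            }
        }
    }
}
```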

When the system starts off, things are fine; then a few heartbeat timers start missing and eventually there are lots of expired sessions.

The logs are really long, but pretty much repetitive, so I am attaching the tail of the logs.
The client timeout is 300s

JVM Parameters
-XX:+UseConcMarkSweepGC  -XX:+PrintGCDetails -XX:MaxGCPauseMillis=50 -Dzookeeper.globalOutstandingLimit=30000 -Xms6000m -Xmx6000m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8180
I have GC logging turned on. I am not seeing long GC pauses, so I don't think that's it.

Next steps I am trying:
1. Look at the CPU utilization on the server machines
2. If the CPU is pegged at 100%, add some additional tracing in the server to validate my hypothesis that the session tracker is getting overwhelmed
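For step 1, besides top/mpstat on the server boxes, each server can also be polled with ZooKeeper's four-letter-word commands over the client port. A minimal sketch (the host name is hypothetical; 2181 is the standard client port):

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class FourLetterWord {
    // Send a four-letter-word command ("stat", "srvr", ...) to a
    // ZooKeeper server and return its plain-text response.
    static String send(String host, int port, String cmd) throws Exception {
        try (Socket sock = new Socket(host, port)) {
            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes());
            out.flush();
            InputStream in = sock.getInputStream();
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {
                sb.append((char) b);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        // "stat" reports connection counts, outstanding requests, latency,
        // and server mode (leader/follower) -- useful for seeing whether
        // the leader is falling behind on pings.
        System.out.println(send("server1", 2181, "stat"));
    }
}
```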

If you folks have any other suggestions, that would greatly help. I started working with zookeeper a couple of weeks ago, so it is quite possible I am missing something obvious.

-----Original Message-----
From: Benjamin Reed [mailto:[EMAIL PROTECTED]]
Sent: Sunday, May 29, 2011 8:42 PM
Subject: Re: Discussion on supporting a large number of clients for a zk ensemble

i second camille's suggestion. i also know there are other people looking into using zookeeper with a large number of clients, so it would be good to figure out what are the limits and then how to cross them. i like your proposed solutions, but i would rather start down that road after we have resolved the issues that we can for the normal clients.


On Fri, May 27, 2011 at 4:23 PM, Fournier, Camille F. [Tech] <[EMAIL PROTECTED]> wrote:
> I would recommend that you spend some time making sure that your guess about the cause is correct before trying to design solutions to the problem. Can you provide us some hard numbers, logs, and configuration information? It's always possible that some aspect of your configuration that you hadn't considered important is in fact the trigger here.
> Thanks,
> Camille
> -----Original Message-----
> From: Vishal Kathuria [mailto:[EMAIL PROTECTED]]
> Sent: Friday, May 27, 2011 6:32 PM
> Subject: Discussion on supporting a large number of clients for a zk
> ensemble
> Hi Folks,
> I wanted to start a discussion on how we can support a large number of
> clients in zookeeper.  I am at facebook and we are using zookeeper for
> quite a few projects. There are a couple of projects where we are
> designing for a large number of clients. The projects are
> 1.       Building a directory service for holding configuration information (lookup table for which node to go to for a given key).
> 2.       For HDFS clients, where clients lookup zookeeper for the
> current namenode
> This information changes infrequently and is small, so update rate or size of data is not an issue.
> The key challenge is to support that large a number of clients (30K to start with, but eventually could be 100K). A big chunk of the clients can try to connect/disconnect at the same time, so a herd effect can happen.
> I was trying out a 3 node ensemble. I noticed that with about 20K clients, there were quite a few session expiries and disconnects.
> I looked through the code briefly, and since all the pings are eventually handled by the leader, my guess is that the leader thread is not keeping up. I haven't yet done the instrumentation/tracing to validate this.