RE: Discussion on supporting a large number of clients for a zk ensemble
Vishal Kathuria 2011-07-01, 23:07
Thanks for the suggestion Dhruba.
I will open a Jira and continue the discussion there. I also got a chance to discuss some of the ideas at the zookeeper community meet yesterday. I have prototyped some of my ideas and I should soon be able to share the performance scenarios and measurements too.

Thanks!
Vishal

-----Original Message-----
From: Dhruba Borthakur [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 01, 2011 2:50 PM
To: [EMAIL PROTECTED]
Subject: Re: Discussion on supporting a large number of clients for a zk ensemble

Hi Ben/Camille: can you comment on Vishal's logs/config? The "local session" idea seems promising to me.

Vishal: it would be nice if you create a JIRA with your proposal and we can continue discussion in the JIRA?

thanks a bunch,
dhruba

On Mon, May 30, 2011 at 11:15 AM, Vishal Kathuria <[EMAIL PROTECTED]> wrote:

> Thanks for looking at this Camille and Benjamin,
>
> setup:
> There are 5 machines, 2 hosting clients and 3 hosting servers.
> There is one client process on each of the client machines. The client
> process has 20 threads, each thread with 500 sessions.
> So I have a total of 20K clients, so it isn't that high really
>
> Hardware
> Two proc Intel(r) Xeon(r) Processor L5420 (total 8 cores) 8G RAM
>
> The workload is fairly simple:
> All sessions do is keep a watch on a node. Once the watch fires, the client
> reads the contents of the node and puts the watch again.
> There is one thread that is periodically updating the node being watched
> (once every 30s - so very infrequent)
>
> When the system starts off, things are fine, then a few timers start
> missing and eventually there are lots of expired connections.
>
> The logs are really long, but pretty much repetitive, so I am attaching the
> tail of the logs.
> The client timeout is 300s
>
> JVM Parameters
> -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:MaxGCPauseMillis=50
> -Dzookeeper.globalOutstandingLimit=30000 -Xms6000m -Xmx6000m -Xdebug
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8180
>
> I have GC logging turned on. I am not seeing long GC pauses, so I don't
> think that's it.
>
> Next steps I am trying
> 1. Look at the CPU utilization on the server machines
> 2. If the CPU is pegged at 100%, add some additional tracing in the server
>    to validate my hypothesis that the session tracker is getting overwhelmed
>
> If you folks have any other suggestions, that would greatly help. I started
> working with zookeeper a couple of weeks ago so it is very likely I might be
> missing something obvious.
>
> Thanks!
> Vishal
>
> -----Original Message-----
> From: Benjamin Reed [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, May 29, 2011 8:42 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Discussion on supporting a large number of clients for a zk ensemble
>
> i second camille's suggestion. i also know there are other people looking
> into using zookeeper with a large number of clients, so it would be good to
> figure out what are the limits and then how to cross them. i like your
> proposed solutions, but i would rather start down that road after we have
> resolved the issues that we can for the normal clients.
>
> ben
>
> On Fri, May 27, 2011 at 4:23 PM, Fournier, Camille F. [Tech] <[EMAIL PROTECTED]> wrote:
> > I would recommend that you spend some time making sure that your guess
> > about the cause is correct before trying to design solutions to the problem.
> > Can you provide us some hard numbers, logs, and configuration information?
> > It's always possible that some aspect of your configuration that you hadn't
> > considered important is in fact the trigger here.
> >
> > Thanks,
> > Camille
> >
> > -----Original Message-----
> > From: Vishal Kathuria [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, May 27, 2011 6:32 PM
> > To: [EMAIL PROTECTED]
> > Subject: Discussion on supporting a large number of clients for a zk ensemble
> >
> > Hi Folks,
> > I wanted to start a discussion on how we can support a large number of
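
As a rough sanity check on the numbers quoted in the thread (2 client machines, 1 process each, 20 threads per process, 500 sessions per thread, one update to the watched node every 30s), the load can be worked out with a small sketch like the one below. The function names are made up for illustration, and the sketch assumes every session receives exactly one watch notification and issues exactly one read per update, which the thread describes but does not measure.

```python
# Back-of-the-envelope load estimate for the test setup described in the thread.
# All constants come from the thread; the one-notification-and-one-read-per-update
# model is an assumption, not a measurement.

CLIENT_MACHINES = 2
PROCESSES_PER_MACHINE = 1
THREADS_PER_PROCESS = 20
SESSIONS_PER_THREAD = 500
UPDATE_INTERVAL_S = 30  # the watched node is updated once every 30 seconds


def total_sessions() -> int:
    """Total number of client sessions across both client machines."""
    return (CLIENT_MACHINES * PROCESSES_PER_MACHINE
            * THREADS_PER_PROCESS * SESSIONS_PER_THREAD)


def requests_per_update(sessions: int) -> dict:
    """Work triggered by a single update, assuming one watch on the node per session."""
    return {
        "watch_notifications": sessions,  # server -> client, one per session
        "reads": sessions,                # client -> server, re-read of the node
    }


def average_read_rate(sessions: int, interval_s: float) -> float:
    """Average reads per second if the burst were spread evenly over the interval."""
    return sessions / interval_s


if __name__ == "__main__":
    n = total_sessions()
    print("sessions:", n)
    print("per update:", requests_per_update(n))
    print("avg reads/s:", round(average_read_rate(n, UPDATE_INTERVAL_S), 1))
```

The average rate understates the peak, since all watches fire at roughly the same moment after each update; the burst size per update is probably the more useful figure when looking at server CPU.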