Zookeeper >> mail # user >> Distribution Problems With Multiple Zookeeper Clients

RE: Distribution Problems With Multiple Zookeeper Clients
Hi Camille

Your assumption is totally right. When I verified again, the clients are receiving events according to the order in which they registered with the server. The clients that registered later are getting fewer notifications. As you suggested, I will check where I can control that behavior, either on the server side or the client side.


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Camille Fournier
Sent: Thursday, May 17, 2012 1:22 PM
Subject: Re: Distribution Problems With Multiple Zookeeper Clients

The below is written assuming that all clients are seeing all events, but then they race to get a lock of some sort to do the work, and the same 10 are always getting the lock to do the work. If in fact not all of your clients are even getting all the events, that's another problem.
So here's what I think happens, although other devs who know this code better may prove me wrong. When a client connects to a server and creates a watch for a particular path, that member of the ZK quorum adds the watch for that path to a WatchManager. Underneath, the WatchManager has a HashSet that contains the watches for that path. When an event happens on that path, the server will iterate through the watchers on that path and send them the watch notification.
It's quite possible that, if your events are infrequent and/or your client servers aren't heavily loaded, the first few clients that registered the watch on each quorum member will receive and process the notification first, simply because their notifications were sent first. If your code resets the watch immediately upon receiving the notification, those same clients will also always reset the watch for that path first.
They always win the race, and thus always do all the work.
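The race described above can be sketched with a toy simulation. This is not real ZooKeeper code and the names are made up; it only illustrates the effect of notifying watchers in a stable registration order when each watcher immediately re-registers:

```python
# Toy simulation (hypothetical, not ZooKeeper internals) of the effect
# described above: the server notifies watchers in a fixed registration
# order, each watcher immediately resets its watch, so the earliest
# registrant is always notified first and always wins the race.

class ToyServer:
    def __init__(self):
        self.watchers = []          # insertion order is preserved

    def register(self, watcher):
        self.watchers.append(watcher)

    def fire_event(self):
        # Notify in registration order, like iterating a stable set.
        pending = list(self.watchers)
        self.watchers.clear()       # watches are one-shot
        winner = None
        for w in pending:
            if winner is None:
                winner = w          # first notified grabs the "lock"
            self.register(w)        # watcher resets its watch at once
        return winner

server = ToyServer()
for name in ["client-1", "client-2", "client-3"]:
    server.register(name)

wins = [server.fire_event() for _ in range(5)]
print(wins)
```

Because re-registration happens in the same order every round, the notification order never changes and "client-1" does all the work in every round.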
In general, the indication is that you have more clients than you need available to do the work. If in fact you don't, perhaps the right thing to do is to investigate how you are handing off work and responding to watch notifications within your client. I.e., if a client that is already doing some work gets a watch notification, it may not want to race for the lock. You may want to schedule the lock attempt and subsequent work in a limited thread pool, so that there is a hard limit of N tasks in flight on each client, which also bounds the max load on each server.

Does this make sense?


On Wed, May 16, 2012 at 3:46 PM, Narasimha Tadepalli < [EMAIL PROTECTED]> wrote:

> Hi Camille
> Sorry for the confusion. Yes it is watches. We have multiple clients
> configured to watch on event change at server end. For example we have
> a data directory of /data/345/text. All 30 clients keep watching for
> event changes under the /data/345 directory; if there is any change,
> clients need to process and read the child nodes. In this situation,
> not all clients are getting equal events. I am looking for a way to
> distribute
> the load equally to all client instances. I hope I provided enough
> clarification now or else let me know.
> Thanks
> Narasimha
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> Camille Fournier
> Sent: Tuesday, May 15, 2012 1:20 PM
> Subject: Re: Distribution Problems With Multiple Zookeeper Clients
> I'm not sure what you mean by messages. Are you talking about watches?
> Can you describe your clients in more detail?
> Thanks,
> Camille
> On Tue, May 15, 2012 at 1:29 PM, Narasimha Tadepalli <
> > Dear All
> >
> > We have a situation where messages are not distributed equally when
> > we have multiple clients listening to one zookeeper cluster. Say we
> > have
> 30 client instances listening to one cluster and when 1000 messages
> > submitted in
> > 30 mins to cluster I assume each client approximately supposed to