Zookeeper, mail # user - Distribution Problems With Multiple Zookeeper Clients


RE: Distribution Problems With Multiple Zookeeper Clients
Narasimha Tadepalli 2012-05-25, 17:30
Actually, we lock jobs before accepting new ones. No worker locks a job unless it is ready to process it. Let me follow up on your second response, where you made some good assumptions.

The stats below give a rough picture of what is going on.

Zookeeper client      Jobs processed in two hours

Client Instance1      100
Client Instance2       90
Client Instance3       80
Client Instance4       70
Client Instance5       60
Client Instance6       50
Client Instance7       40
Client Instance8       30

All these instances were started 24 hours ago at different times, but the data shown here covers only the last two hours. Your assumption was that Client Instance1 registered with the server first, and that is why it always wins the race to receive the event notification. After checking the facts, that is correct. My problem is how to force these clients to perform equally, or approximately equally, i.e. each worker instance should process about 65 jobs in two hours (all 8 workers processed 520 jobs in total, and 520 / 8 = 65). As I mentioned, it doesn't have to be exactly 65, but it should not be 30 or 100. I hope my situation is clear now. BTW, in reality we launch between 50 and 100 workers in a day.

Thanks
Narasimha
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Camille Fournier
Sent: Friday, May 25, 2012 11:48 AM
To: [EMAIL PROTECTED]
Subject: Re: Distribution Problems With Multiple Zookeeper Clients

If your code is doing the following:
client gets watch notification
client immediately tries to grab lock
client then puts job in queue to process

That's not going to work.

You need to do
client gets watch notification
client puts lock grab in queue with the work that is being processed
when queue has bandwidth, client tries to grab lock and processes job

The grabbing of the lock to do work and the queue of threads available to do work need to be coupled, otherwise you are grabbing work you don't have capacity to do.
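
A minimal sketch of that coupling, assuming the worker runs its jobs on a fixed-size thread pool. The class name `CapacityCoupledWorker` and the `tryLock` and `processJob` callbacks are hypothetical stand-ins for whatever lock recipe and job code you already have; nothing here is ZooKeeper API. The point is only that the lock attempt is submitted to the same pool that does the work, so it runs when a thread is actually free.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Sketch: the lock attempt runs on the worker pool, so a job is claimed only when a thread is free. */
class CapacityCoupledWorker {
    private final ExecutorService pool;
    private final Predicate<String> tryLock;   // hypothetical: returns true if this worker claimed the job
    private final Consumer<String> processJob; // hypothetical: does the actual work

    CapacityCoupledWorker(int threads, Predicate<String> tryLock, Consumer<String> processJob) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.tryLock = tryLock;
        this.processJob = processJob;
    }

    /** Call this from the watch callback. It only enqueues; it never touches the lock directly. */
    void onJobNotification(String jobId) {
        pool.submit(() -> {
            // Runs only once a pool thread is free. If another worker already
            // claimed the job in the meantime, tryLock fails and we move on.
            if (tryLock.test(jobId)) {
                processJob.accept(jobId);
            }
        });
    }

    /** Stops accepting work and waits for queued tasks; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a busy worker's lock attempts wait behind its running jobs, which gives idle workers a chance to claim the job first.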

You can also hack this by
client gets watch notification
client does a random sleep or a sleep based on amount of work currently on this machine, then tries to grab lock
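
The sleep-based hack could look roughly like the sketch below. The class `LockBackoff`, the method `delayMillis`, and its parameters are made up for illustration, and the numbers in the usage note are arbitrary. The delay grows with the number of jobs already in flight on this worker, plus a small random jitter so that idle workers do not all wake at the same instant.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch: how long to wait after a watch notification before trying to grab the lock. */
class LockBackoff {
    /**
     * @param activeJobs      jobs currently in flight on this worker
     * @param perJobMillis    extra delay added per in-flight job (assumed tuning knob)
     * @param maxJitterMillis exclusive upper bound of the random jitter; 0 disables jitter
     */
    static long delayMillis(int activeJobs, long perJobMillis, long maxJitterMillis) {
        long jitter = maxJitterMillis > 0
                ? ThreadLocalRandom.current().nextLong(maxJitterMillis)
                : 0L;
        return (long) activeJobs * perJobMillis + jitter;
    }
}
```

For example, with `delayMillis(activeJobs, 200, 100)` an idle worker waits under 100 ms while a worker with three jobs in flight waits at least 600 ms. It is probably better to apply the delay on your own thread or scheduler rather than sleeping inside the watch callback, since ZooKeeper delivers events on a single event thread and blocking it delays other notifications.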

C

On Fri, May 25, 2012 at 12:41 PM, Narasimha Tadepalli < [EMAIL PROTECTED]> wrote:

> No, actually the server keeps accumulating a lot of jobs in the queue
> which are not being picked up by any of the idle worker instances.
> Those jobs wait until the other workers finish the jobs they are
> currently processing. Where exactly are you suggesting I put sleeps to
> prevent watchers from receiving further events? As long as the
> zookeeper session is active, I didn't find any way to stop these
> watchers from receiving events. Please advise me if there is a way to
> control it.
>
> Thanks
> Narasimha
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> Camille Fournier
> Sent: Thursday, May 24, 2012 5:22 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Distribution Problems With Multiple Zookeeper Clients
>
> You can put random sleeps in after you get a notification before you
> try to grab the lock, or sleeps based on the active job count, to
> favor workers with no or few jobs in flight. It seems to me that if
> you have limited the jobs able to be processed on a worker by limiting
> your thread pool appropriately, and if you still aren't hitting all 30
> servers, maybe you don't need 30 servers to be doing these jobs? Is that possible?
>
> C
>
>
> On Thu, May 24, 2012 at 3:55 PM, Narasimha Tadepalli <
> [EMAIL PROTECTED]> wrote:
>
> > Hi Camille
> >
> > I tried to control the job load at zookeeper clients by minimizing
> > the number of jobs to process, but no luck in forcing the other idle