|
Olivier Van Acker
2011-11-25, 10:45
nileader
2011-11-26, 05:55
Jordan Zimmerman
2011-11-26, 06:17
Ted Dunning
2011-11-26, 07:37
Jérémie BORDIER
2011-11-26, 10:33
Ted Dunning
2011-11-26, 19:41
Olivier Van Acker
2011-11-28, 12:28
Olivier Van Acker
2011-11-28, 16:29
Jérémie BORDIER
2011-11-28, 16:35
Ted Dunning
2011-11-28, 16:41
Olivier Van Acker
2011-11-28, 16:47
|
-
zookeeper leadership election example application
Olivier Van Acker 2011-11-25, 10:45
I've written an example app showing how to implement leader election with ZooKeeper. Is there anyone on the list who'd like to review it and tell me whether it needs improvement?

The app is on GitHub: https://github.com/cyberroadie/zookeeper-leader

How it works is explained on my blog: http://cyberroadie.wordpress.com/2011/11/24/implementing-leader-election-with-zookeeper/

Olivier
-
Re: zookeeper leadership election example application
nileader 2011-11-26, 05:55
A good idea.
-
Re: zookeeper leadership election example application
Jordan Zimmerman 2011-11-26, 06:17
A few comments:

* NodeMonitor.createRootIfNotExists() should catch NodeExists. In the case of multiple clients, this is a possibility.

* I'd add a start method and create the ZooKeeper instance there. This gives users a chance to set a listener so as to receive all messages.

* All ZooKeeper operations should be in some kind of retry loop. The client can lose connection to a given server but successfully reconnect to another one in the cluster.

* When creating the Znode, it can succeed on the server but fail to return the result to the client. On a Disconnect/Session exception, you should retry and then call getChildren and search for your node.

-JZ
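Jordan's first and last points can be sketched roughly as follows. This is a minimal, self-contained model and not the real client: FakeZk, the two exception classes, ElectionNodes, and the idea of tagging each candidate node with a client-unique prefix are all assumptions made for illustration. In the real ZooKeeper Java client the corresponding errors are KeeperException.NodeExistsException and KeeperException.ConnectionLossException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-ins for the client's NodeExists and ConnectionLoss errors.
class NodeExistsException extends Exception {}
class ConnectionLossException extends Exception {}

// Hypothetical in-memory model of the few operations this sketch needs.
class FakeZk {
    private final TreeSet<String> nodes = new TreeSet<>();
    private int nextSeq = 0;
    boolean dropNextReply = false; // simulate: server applies the create, reply is lost

    void create(String path) throws NodeExistsException {
        if (!nodes.add(path)) throw new NodeExistsException();
    }

    // Appends a zero-padded 10-digit sequence number to the given prefix.
    String createSequential(String prefix) throws ConnectionLossException {
        String path = prefix + String.format("%010d", nextSeq++);
        nodes.add(path);
        if (dropNextReply) {
            dropNextReply = false;
            throw new ConnectionLossException();
        }
        return path;
    }

    List<String> getChildren(String parent) {
        List<String> out = new ArrayList<>();
        for (String n : nodes) {
            if (n.startsWith(parent + "/")) out.add(n.substring(parent.length() + 1));
        }
        return out;
    }
}

class ElectionNodes {
    // Another client may create the root first; that counts as success.
    static void createRootIfNotExists(FakeZk zk, String root) {
        try {
            zk.create(root);
        } catch (NodeExistsException alreadyThere) {
            // Lost the race: the root exists, which is all we wanted.
        }
    }

    // Create our candidate node. Before every attempt, look for a node that
    // already carries our id, so a retry after a lost reply does not leave a
    // second node behind. Assumes client ids are unique.
    static String createCandidate(FakeZk zk, String root, String clientId, int maxAttempts) {
        String prefix = clientId + "-";
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            for (String child : zk.getChildren(root)) {
                if (child.startsWith(prefix)) return root + "/" + child;
            }
            try {
                return zk.createSequential(root + "/" + prefix);
            } catch (ConnectionLossException lost) {
                // Outcome unknown: loop around and check the children first.
            }
        }
        throw new IllegalStateException("could not create candidate node");
    }
}
```

The check-before-create step is what makes the retry safe in this model; without some way to recognise its own node, a client cannot tell whether the earlier create took effect.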
-
Re: zookeeper leadership election example application
Ted Dunning 2011-11-26, 07:37
I think that the code also needs:

* comments. You need to put some javadocs in that say what the different classes and methods are intended to do. External documentation is nice, but hardly sufficient.

* global handling of disconnection and pausing of masters during a disconnect.

* a description of how you think you are handling error conditions

* tests that demonstrate that you handle error conditions

I would also take issue with Jordan's suggestion for retry logic. With ephemeral sequential nodes, retries are very dangerous in certain corner failure modes. This is the primary reason that I prefer the single file leader election method. With a reasonable number of masters (the most common case) this is completely sufficient since the herd effect isn't a problem for such a small herd.
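One way to read Ted's point about pausing masters during a disconnect is as a small gate that the master consults before doing any leader-only work. The sketch below is an assumption about how that could look, not code from the app under review: ConnState and LeaderGate are hypothetical names, and the three states mirror the SyncConnected, Disconnected and Expired values of the real client's Watcher.Event.KeeperState.

```java
// Hypothetical mirror of the connection states a ZooKeeper watcher reports.
enum ConnState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

// Tracks whether this process may act as leader right now.
class LeaderGate {
    private boolean elected = false;
    private boolean connected = true;

    void onElected() { elected = true; }

    void onConnectionState(ConnState state) {
        switch (state) {
            case DISCONNECTED:
                // Leadership can no longer be confirmed: pause, but keep the claim.
                connected = false;
                break;
            case SYNC_CONNECTED:
                // Same session is back: resume if we were elected.
                connected = true;
                break;
            case EXPIRED:
                // Session is gone, and with it any ephemeral node: give up leadership.
                connected = false;
                elected = false;
                break;
        }
    }

    boolean mayActAsLeader() { return elected && connected; }
}
```

After an EXPIRED event the gate stays closed even if a later SYNC_CONNECTED arrives, since a new session would have to win the election again.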
-
Re: zookeeper leadership election example application
Jérémie BORDIER 2011-11-26, 10:33
Hello Ted,
Can you point to a good implementation of that single file leadership election method you're describing?

Thanks,
Jérémie

--
Jérémie 'ahFeel' BORDIER
-
Re: zookeeper leadership election example application
Ted Dunning 2011-11-26, 19:41
I have embedded similar code into a number of programs, but I haven't published them.

You could take a look at HBase. I think it uses the single file technique.
-
Re: zookeeper leadership election example application
Olivier Van Acker 2011-11-28, 12:28
Thanks for the suggestions,

I've implemented your first two suggestions; just a couple of questions about the other ones:

> * All ZooKeeper operations should be in some kind of retry loop. The
> client can lose connection to a given server but successfully reconnect to
> another one in the cluster.

I thought this retry happened via process(WatchedEvent event), where event.getType() == Watcher.Event.EventType.None and event.getState() == SyncConnected. I catch this in NodeMonitor.processNoneEvent(), where I try to create the client znode again on reconnecting.

> * When creating the Znode, it can succeed on the server but fail to return
> the result to the client. On a Disconnect/Session exception, you should
> retry and then call getChildren and search for your node.

I'm creating the znodes as ephemeral nodes. Would a disconnect not result in the disappearance of the znode, so that getChildren would never contain the client znode?

Cheers,
Olivier
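On the ephemeral-node question: in ZooKeeper an ephemeral znode is tied to the client's session rather than to a single connection, so it is removed when the session is closed or expires, not merely because the client was disconnected for a while. The sketch below is a deliberately simplified model of that rule. EphemeralSession is a hypothetical class, and the timing is simulated with plain numbers; in a real ensemble the servers decide when a session has expired.

```java
// Hypothetical model of a session that owns one ephemeral znode.
class EphemeralSession {
    private final long timeoutMs;
    private long disconnectedAtMs = -1; // -1 while connected
    private boolean expired = false;

    EphemeralSession(long timeoutMs) { this.timeoutMs = timeoutMs; }

    void disconnect(long nowMs) {
        if (disconnectedAtMs < 0) disconnectedAtMs = nowMs;
    }

    // Reconnecting within the timeout keeps the session; later than that it has expired.
    void reconnect(long nowMs) {
        if (disconnectedAtMs >= 0 && nowMs - disconnectedAtMs > timeoutMs) expired = true;
        disconnectedAtMs = -1;
    }

    // The ephemeral znode exists for exactly as long as the session does.
    boolean ephemeralNodeExists(long nowMs) {
        if (expired) return false;
        return disconnectedAtMs < 0 || nowMs - disconnectedAtMs <= timeoutMs;
    }
}
```

Under this model a brief disconnect followed by a reconnect leaves the node in place, which is why searching getChildren for it after a connection loss can still find it.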
-
Re: zookeeper leadership election example application
Olivier Van Acker 2011-11-28, 16:29
> This is the primary reason that I prefer the single file
> leader election method.

What is single file leader election? A quick search on Google didn't come up with any results :-(

Olivier
-
Re: zookeeper leadership election example application
Jérémie BORDIER 2011-11-28, 16:35
Here's the implementation in the HBase source repository that Ted pointed out:
http://svn.apache.org/viewvc/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKLeaderManager.java?view=markup

Jérémie

--
Jérémie 'ahFeel' BORDIER
-
Re: zookeeper leadership election example application
Ted Dunning 2011-11-28, 16:41
The idea is that all the leader candidates try to create the same ephemeral file. One wins. That one is the winner.

On disconnect, that leader pauses its services. On reconnect it either continues or is notified of session expiration.

All the other candidates watch the file and if it vanishes, they try again.

This is very simple and doesn't depend on sequential ephemerals. It is subject to notification storms and the so-called herd effect, but for <50 candidates this doesn't much matter.
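The single-file scheme described above can be sketched with a small in-memory model. Everything here is hypothetical and only illustrates the control flow: LeaderFile stands in for one ephemeral znode plus its watchers, and Candidate for a process taking part in the election. Against a real ensemble the create would be an ephemeral create on a fixed path and the watch would be set with an exists call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for one ephemeral znode plus its watchers.
class LeaderFile {
    private Candidate owner; // null when the file does not exist
    private final List<Candidate> watchers = new ArrayList<>();

    // Models an ephemeral create: succeeds only if the file is absent.
    boolean tryCreate(Candidate c) {
        if (owner != null) return false;
        owner = c;
        return true;
    }

    void watch(Candidate c) { watchers.add(c); }

    String ownerName() { return owner == null ? null : owner.name; }

    // Models the owner's session ending: the file vanishes and every watcher
    // is notified at once (this is where the herd effect comes from).
    void ownerSessionEnded() {
        if (owner != null) owner.leader = false;
        owner = null;
        List<Candidate> toNotify = new ArrayList<>(watchers);
        watchers.clear(); // watches are one-shot, so each candidate must re-register
        for (Candidate c : toNotify) c.fileVanished();
    }
}

class Candidate {
    final String name;
    private final LeaderFile file;
    boolean leader = false;

    Candidate(String name, LeaderFile file) {
        this.name = name;
        this.file = file;
    }

    // Try to take the file; on failure, watch it and wait.
    void run() {
        leader = file.tryCreate(this);
        if (!leader) file.watch(this);
    }

    // The file vanished: every watcher simply tries again.
    void fileVanished() { run(); }
}
```

With three candidates, the first to call run() becomes leader and the other two watch; when the leader's session ends, both are woken, one wins, and the other goes back to watching.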
-
Re: zookeeper leadership election example application
Olivier Van Acker 2011-11-28, 16:47
> This is very simple and doesn't depend on sequential ephemerals. It is
> subject to notification storms and the so-called herd effect, but for <50
> candidates this doesn't much matter.

Very elegant, I like it. I'll implement it in the example.

Olivier
|