-
help on Zookeeper code walk through? Yang 2011-07-16, 08:44
I'm wondering: if a client loses its session, and with it its ephemeral znode, how is the watcher triggered under the hood?

I went through the code and found something that looks related:

ZKDatabase.killSession() --> DataTree.killSession() --> DataTree.deleteNode() --> WatchManager.triggerWatch() --> Watcher.process()

But how is ZKDatabase.killSession() called? From the info given in http://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#ch_zkSessions I can see that the ZooKeeper client code periodically pings the server to maintain liveness. But how does the server check for this liveness and trigger killSession()? Here I'm having difficulty connecting the dots.

Could you please give me some help walking through this piece of code?

Thanks
Yang
-
Re: help on Zookeeper code walk through? Camille Fournier 2011-07-16, 15:46
Check out SessionTracker and SessionTrackerImpl; that is what the servers use to keep track of session liveness.
-
Re: help on Zookeeper code walk through? Benjamin Reed 2011-07-17, 00:27
If you are running with multiple servers, it is the leader that declares sessions dead, so the leader will call killSession(). The followers track the liveness of the clients with pings and periodically send liveness summaries to the leader.

See Camille's email for the specific classes to look at.

ben
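To make the two replies above concrete, here is a minimal sketch of deadline-based session tracking. It is not ZooKeeper's actual SessionTrackerImpl, which differs in detail; the names ToySessionTracker, Expirer, touch and tick are hypothetical and exist only for illustration. The Expirer callback plays the role that killSession() plays in the call chain from the first message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a server-side session tracker (hypothetical, not ZooKeeper code). */
class ToySessionTracker {
    /** Invoked once for each session that is declared dead. */
    interface Expirer {
        void expire(long sessionId);
    }

    // sessionId -> absolute time (ms) at which the session expires
    private final Map<Long, Long> deadlines = new HashMap<>();
    private final Expirer expirer;

    ToySessionTracker(Expirer expirer) {
        this.expirer = expirer;
    }

    /** Record a ping (or any request) from a session, pushing its deadline out. */
    void touch(long sessionId, long nowMs, long timeoutMs) {
        deadlines.put(sessionId, nowMs + timeoutMs);
    }

    /** Periodic check: expire every session whose deadline has passed. */
    List<Long> tick(long nowMs) {
        List<Long> expired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (entry.getValue() <= nowMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        // Notify after the scan so the callback never sees a half-updated map.
        for (long sessionId : expired) {
            expirer.expire(sessionId);
        }
        return expired;
    }
}
```

In a multi-server setup, as Ben describes, only the leader would run the equivalent of tick(); followers would forward their clients' pings to it in batches.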
-
Re: help on Zookeeper code walk through? Yang 2011-07-18, 08:00
Thanks Camille and Ben.
I get the basic picture now.

I have another question: in a leader election scenario (for example HBase Master election), I want to make sure that at any time at most one node is running as master, and that there is indeed one running as master all the time except for a very short failover period.

Suppose only the connection between the current master and ZK goes down. ZK senses the lack of pings and kills the session and the ephemeral child node owned by the leader, and the next client node kicks in as leader. At this point, if the current leader machine is still working fine, with its traffic going out to its application servers as normal, would it blissfully keep acting as leader and violate our "single master" goal? For example, suppose Watcher.process() catches the nodeDelete event and tries to set some variable to stop the application server. If this thread is stopped before the variable is set and is never invoked again, the application server could just keep happily going along...?

For example, the following dummy code:

    import java.io.IOException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    class MyApplication {
        static final String HOST_PORT = "localhost:2181";  // placeholder connect string
        static final String MY_NODE = "/election/master";  // placeholder path of my owner node
        volatile boolean should_stop = false;

        class MyZKWatcher implements Watcher {
            public void process(WatchedEvent e) {
                // e is nodeDelete of my owner node
                if (e.getType() == Watcher.Event.EventType.NodeDeleted
                        && MY_NODE.equals(e.getPath())) {
                    should_stop = true; //*************
                }
            }
        }

        public void runApp() throws IOException {
            ZooKeeper zk = new ZooKeeper(HOST_PORT, 3000, new MyZKWatcher());
            while (!should_stop) {
                sendMessagesAsLeader();
            }
        }

        void sendMessagesAsLeader() {
            // send out some messages to my application servers, assuming I'm leader
        }

        public static void main(String[] args) throws IOException {
            new MyApplication().runApp();
        }
    }

Basically, if the nodeDelete event is caught but the Watcher stops right at the "//*****" line, the application main loop could still be going on? Otherwise, do I have to put a node exists() check before I send out every application message?

Thanks a lot
Yang
-
Re: help on Zookeeper code walk through? Fournier, Camille F. 2011-07-18, 13:51
If the zk cluster doesn't get pings from your existing master, the zk client on that master should see a disconnected state event, not a node deletion event. Upon seeing that event, it should stop acting as master until such time as it can determine whether it has reconnected and is still master, or it reconnects and sees that its original session has failed or the master node is deleted.
C
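The reaction Camille describes (stop acting as master on a disconnect, resume only after confirming the master node is still yours) could be sketched roughly as follows. This is a self-contained illustration, not ZooKeeper client code: ConnState is a hypothetical stand-in for the states in Watcher.Event.KeeperState, and MasterGate, becomeMaster, onStateChange and the masterNodeStillOurs argument are made-up names. In a real application the watcher callback would feed state changes into something like this, and the caller would determine masterNodeStillOurs itself after reconnecting.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical mirror of the connection states discussed in this thread. */
enum ConnState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

/** Tracks whether this process may currently act as master (illustrative only). */
class MasterGate {
    private final AtomicBoolean actingAsMaster = new AtomicBoolean(false);
    private volatile boolean sessionLost = false;

    /** Called once after winning the election. */
    void becomeMaster() {
        if (!sessionLost) {
            actingAsMaster.set(true);
        }
    }

    /**
     * Called from the watcher callback on every connection state change.
     * masterNodeStillOurs is only consulted on reconnect and must be
     * determined by the caller (for example by re-reading the master node).
     */
    void onStateChange(ConnState state, boolean masterNodeStillOurs) {
        switch (state) {
            case DISCONNECTED:
                // Step down immediately; we cannot know whether the session survives.
                actingAsMaster.set(false);
                break;
            case SYNC_CONNECTED:
                // Resume only if the old session and master node are both intact.
                actingAsMaster.set(masterNodeStillOurs && !sessionLost);
                break;
            case EXPIRED:
                // The session is gone for good; a new election is required.
                sessionLost = true;
                actingAsMaster.set(false);
                break;
        }
    }

    /** The application's main loop checks this before each unit of master work. */
    boolean mayActAsMaster() {
        return actingAsMaster.get();
    }
}
```

The main loop would test mayActAsMaster() instead of a flag that is set only on node deletion, so a disconnect alone is enough to pause master work.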
-
Re: help on Zookeeper code walk through? Ted Dunning 2011-07-18, 15:52
To amplify slightly, there is a range of possible strategies you can use in the disconnect scenario.

One is to have a special "master in waiting" state that is entered as soon as a disconnect is seen. This is the most common strategy.

A second option is to continue operating as master for some part of the session expiration period. This is appropriate when not having a master is a really bad thing and you want to minimize the time that this can happen. Of course, if you can't reach ZK, the chances that you can function as a master are somewhat limited.

A variant on the second option would be to serve as a read-only master during the time period between the disconnect event and the estimated session expiration.

The second and third options imply that you trust your clock, which may be a bad assumption in a virtual environment like EC2.
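The second and third strategies above could be sketched as a small time-bounded lease. This is only an illustration under stated assumptions: MasterLease and its methods are hypothetical names, the clock is injected so that a monotonic source such as System::nanoTime can be used, and the grace period is something the application must choose to be comfortably shorter than whatever remains of the session timeout. As Ted notes, the whole approach stands or falls with how far the local clock can be trusted.

```java
import java.util.function.LongSupplier;

/** Illustrative lease: keep acting as master for a bounded time after a disconnect. */
class MasterLease {
    private final LongSupplier nanoClock;   // assumed monotonic, e.g. System::nanoTime
    private final long graceNanos;
    private boolean disconnected = false;
    private long disconnectedAtNanos;

    MasterLease(LongSupplier nanoClock, long graceMillis) {
        this.nanoClock = nanoClock;
        this.graceNanos = graceMillis * 1_000_000L;
    }

    /** Called when the disconnect is observed; repeated calls keep the first timestamp. */
    synchronized void onDisconnected() {
        if (!disconnected) {
            disconnected = true;
            disconnectedAtNanos = nanoClock.getAsLong();
        }
    }

    /** Called once the client has reconnected and confirmed it is still master. */
    synchronized void onReconnected() {
        disconnected = false;
    }

    /** Second option: full master duties until the grace period runs out. */
    synchronized boolean mayActAsMaster() {
        return !disconnected
                || nanoClock.getAsLong() - disconnectedAtNanos < graceNanos;
    }

    /** Third option: serve reads during the grace period, but accept writes only while connected. */
    synchronized boolean mayWrite() {
        return !disconnected;
    }
}
```

A caller would construct it as new MasterLease(System::nanoTime, graceMillis) and consult mayActAsMaster() or mayWrite() before each operation.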
-
Re: help on Zookeeper code walk through? Yang 2011-07-18, 16:02
Thanks Camille.
Now I see that it's the Watcher.Event.KeeperState.Disconnected event that is generated by ClientCnxn.SenderThread.run() ... queueEvent, and then processed by EventThread.run() ... watcher.process().

It seems that the same scenario I gave above could still happen, i.e. the ClientCnxn.SenderThread or the EventThread could be stopped while the main application thread keeps going happily along. Though this is a very slight possibility, theoretically it is still possible. Or am I missing something?

Thanks
Yang
-
Re: help on Zookeeper code walk through? Yang 2011-07-18, 16:07
Thanks Ted, replied inline.
On Mon, Jul 18, 2011 at 8:52 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> One, is to have a special "master in waiting" state that is entered as soon
> as a disconnect is seen. This is the most common strategy.

I think my main worry is that the "is seen" part cannot be done by the application thread fast enough, because the ping/event detection is part of the Sender/Event threads, which can be scheduled arbitrarily later than the main thread.
-
RE: help on Zookeeper code walk through? Fournier, Camille F. 2011-07-18, 16:30
The client connection sets two values. One is the negotiatedSessionTimeout, which is how long the server will wait before timing out a session it has not received a message from. The second is the readTimeout, which is set after the session is established to:

readTimeout = negotiatedSessionTimeout * 2 / 3

What should happen is the following: Imagine the client has a network partition from the ZK cluster. It will try to ping the cluster within the readTimeout and detect that it is disconnected. Your code will then see a disconnected event. Now, you still have 1/3 of the negotiated session timeout before the session is completely timed out. So while there could be some bit of time between when the disconnected event happens and when the client sees and acts on it, it should be less than the session timeout, and it is fair for the master to continue to act as the master during this time, because no other server should have taken over as the master yet: the master's lock node still exists on the ZK server.

Now, if you are in a debugger, let's say, and have paused the event thread while the main thread continues running, then yes, you can continue to incorrectly act as the master. But as long as your timeouts are reasonable, the interval between when the disconnected event happens and when the master detects it and ceases to act as the master should be short enough that you are never improperly acting as the master, because your node will still exist on the ZK server. Note that this does mean that the actions inside your master while loop need to be able to complete well within 1/3 of the negotiated session timeout, or else you will act improperly as the master.

The only exception to this I can think of in "normal use" is the following: I receive a disconnected event, and before processing it I have a full GC that takes so long that the session is fully expired before it completes. Then there may be a period after the VM resumes where I am acting as the master when in fact I haven't yet processed a disconnected/session expired event.
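The timing budget in the message above can be worked through with a few lines of arithmetic. The formula for the read timeout is the one quoted in the message; the class and method names (TimeoutBudget, remainingGraceMs, fitsBudget) are hypothetical helpers added for illustration, and fitsBudget is just one way to express the "well within 1/3 of the session timeout" rule of thumb.

```java
/** Worked arithmetic for the session timeout budget (illustrative helper, not ZooKeeper code). */
class TimeoutBudget {
    /** Read timeout as quoted in the thread: two thirds of the negotiated session timeout. */
    static int readTimeoutMs(int negotiatedSessionTimeoutMs) {
        return negotiatedSessionTimeoutMs * 2 / 3;
    }

    /** Time left between detecting the disconnect and the session expiring on the server. */
    static int remainingGraceMs(int negotiatedSessionTimeoutMs) {
        return negotiatedSessionTimeoutMs - readTimeoutMs(negotiatedSessionTimeoutMs);
    }

    /** True if one iteration of the master loop fits inside the remaining grace period. */
    static boolean fitsBudget(int negotiatedSessionTimeoutMs, int worstCaseIterationMs) {
        return worstCaseIterationMs < remainingGraceMs(negotiatedSessionTimeoutMs);
    }
}
```

For the 3000 ms timeout requested in the dummy code earlier in the thread, and assuming the server negotiates that value unchanged, readTimeoutMs(3000) is 2000 and remainingGraceMs(3000) is 1000, so a master loop iteration that can take 1500 ms would not fit the budget.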
-
Re: help on Zookeeper code walk through? Yang 2011-07-18, 16:39
Got it, so sessionTimeout (which is roughly the grace period before the next leader can take over) should be guaranteed to be higher than the time needed for the application thread to detect the Disconnected event from the ZK client. The latter can be inflated by GC pauses, thread scheduling delays, etc. Right?

Thanks
Yang
-
RE: help on Zookeeper code walk through? Fournier, Camille F. 2011-07-18, 18:38
Yup. Note that GC pause could theoretically cause the unfortunate scenario of a client continuing to act as master when it shouldn't, if your timeout is very low and your GC happens very inconveniently. But for things like context switching, even relatively low timeouts should not cause a problem.
C
-
Re: help on Zookeeper code walk through? Ted Dunning 2011-07-19, 20:34
But there are myriad ways to block a program for a short period of time.
For one thing, the thread scheduler in a busy program could easily starve the heartbeat for a short period of time.