Node being there and not at the same time
Mattias Persson 2012-08-23, 10:30
Hi,
I've got a problem that I've seen on only a few occasions and which confuses me a bit. I construct a ZooKeeper client (I'm running version 3.3.2) against a ZK quorum of size 3. I get a SyncConnected event in a Watcher of mine, and in that watcher I implement get-or-create(-if-absent) behaviour where I first do a:
zooKeeper.getData( myPath, false, null );
If that produces a NONODE code, I try to create the node with:
zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );
If that fails with a NODEEXISTS code, I just get the node, assuming someone else created it before me. What I see is that this second getData call, made after the NODEEXISTS code and identical to the first one, returns a NONODE code again. In this scenario I'm 100% certain that the node already exists in the quorum at myPath in the first place.
Questions:
1) How can this happen?
2) Do I use ZooKeeper here in an improper way?
3) Will a later version fix any potential issue I might have hit?
4) What are the guarantees around the state of my ZooKeeper instance after I receive a SyncConnected event? Is it fully synced with the master at that point, or will a call to sync() be necessary first?
Best, Mattias
-- Mattias Persson, [[EMAIL PROTECTED]] Hacker, Neo Technology www.neotechnology.com
Re: Node being there and not at the same time
David Nickerson 2012-08-23, 14:53
It's a little difficult to guess what your application is doing, but it sounds like there's "someone else" who can create and delete the nodes you're trying to work with. So when you create the node and check its data, someone else might have deleted it before you got the chance to check the data. The same is true when you check that it exists and then check the data. You could ensure that the node won't be deleted by using ACLs or giving the node a sequential ephemeral child.
Re: Node being there and not at the same time
Mattias Persson 2012-08-23, 15:21
Hi David,
Nowhere in the code does that node get deleted. If we set that suspicion aside, could there be something else?
-- Mattias Persson, [[EMAIL PROTECTED]] Hacker, Neo Technology www.neotechnology.com
Re: Node being there and not at the same time
Bill Bridge 2012-08-25, 00:15
Mattias,
Is it possible that after you get NODEEXISTS from creation and before you do the second getData(), you reconnect to another ZooKeeper instance? If so, maybe the new connection is to a follower that has not yet seen the creation. If this is what is happening, then a sync() after the second NONODE with a third getData() should work. By only doing the sync() when you hit the unusual race condition it will have no performance impact.
Bill
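The retry Bill describes (a sync() only after the unexpected second NONODE, followed by a third getData()) can be sketched as below. This is a minimal model under stated assumptions, not the ZooKeeper client API: `ZkLike`, `NoNodeException`, `NodeExistsException`, `GetOrCreate` and `LaggingFake` are all hypothetical names invented for illustration, the exceptions are unchecked for brevity, and the fake simply pretends that reads come from a follower view that lags the leader until sync() is called. Whether such a lag can actually be observed by a single session is exactly what the rest of the thread disputes.

```java
// Hypothetical stand-in for the three calls discussed in this thread.
// NOT the real org.apache.zookeeper API; it only models the behaviour.
interface ZkLike {
    byte[] getData(String path);            // throws NoNodeException if absent
    void create(String path, byte[] data);  // throws NodeExistsException if present
    void sync(String path);                 // bring the read view up to date
}

class NoNodeException extends RuntimeException {}
class NodeExistsException extends RuntimeException {}

class GetOrCreate {
    // get-or-create-if-absent, with a sync() only on the rare stale read
    static byte[] getOrCreate(ZkLike zk, String path, byte[] initial) {
        try {
            return zk.getData(path);
        } catch (NoNodeException firstMiss) {
            // node not visible yet: try to create it
        }
        try {
            zk.create(path, initial);
            return initial;
        } catch (NodeExistsException raced) {
            // somebody else created it first: read it back
        }
        try {
            return zk.getData(path);
        } catch (NoNodeException staleRead) {
            // unusual case: force a catch-up, then read a third time;
            // if this still throws, the node really is gone
            zk.sync(path);
            return zk.getData(path);
        }
    }
}

// In-memory fake (assumption): writes land on "leader", reads come from
// "follower", and the follower only catches up when sync() is called.
class LaggingFake implements ZkLike {
    private final java.util.Map<String, byte[]> leader = new java.util.HashMap<>();
    private final java.util.Map<String, byte[]> follower = new java.util.HashMap<>();
    int syncCalls = 0;

    void createOnLeaderOnly(String path, byte[] data) {
        leader.put(path, data);
    }

    public byte[] getData(String path) {
        byte[] data = follower.get(path);
        if (data == null) {
            throw new NoNodeException();
        }
        return data;
    }

    public void create(String path, byte[] data) {
        if (leader.containsKey(path)) {
            throw new NodeExistsException();
        }
        leader.put(path, data);
        follower.put(path, data);
    }

    public void sync(String path) {
        syncCalls++;
        follower.putAll(leader);
    }
}
```

In the common case the method returns after one or two calls and never touches sync(), which matches Bill's point that the extra round trip is paid only when the race is actually hit.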
Re: Node being there and not at the same time
Alexander Shraer 2012-08-25, 01:11
Bill, if I understand correctly this shouldn't be possible - the client will not be able to connect to a server that is less up-to-date than that same client. So if the create completed at the client before it disconnects the new server will have to know about it too otherwise the connection will fail. See Leader.waitForEpochAck:
    if (ss.isMoreRecentThan(leaderStateSummary)) {
        throw new IOException("Follower is ahead of the leader, leader summary: "
                + leaderStateSummary.getCurrentEpoch()
                + " (current epoch), "
                + leaderStateSummary.getLastZxid()
                + " (last zxid)");
    }
Of course, it's possible that another client connected to a different server doesn't see the create.
Alex
Re: Node being there and not at the same time
Bill Bridge 2012-08-27, 17:22
Alex, you certainly know the code much better than I do, so I may be mistaken here. It looks to me like waitForEpochAck() is about changes in the set of peers and is not related to client connects/disconnects. I do not see how it would be called if a client disconnected due to some problem of its own, such as being too slow to heartbeat, and then reconnected to a different peer or observer.
You suggest that a reconnecting client should ensure the new server has seen all transactions that the client has seen. This sounds like the right thing to do. This would certainly eliminate the race condition I postulated. This sounds like the kind of thing someone would have already thought of. If this is not already done then it would be a good change to make. I do not know where the code to do that would be. It could be part of the server reconnect code or it could be a sync() in the client library.
If Mattias's code creates a new session when reconnecting, rather than reconnecting to the same session, then he could have the problem described even if reconnect ensures the client is not ahead of the server. He could fix this either by reconnecting to the same session, or simply doing a sync() when necessary.
Thanks, Bill
Re: Node being there and not at the same time
Alexander Shraer 2012-08-27, 17:40
Hi Bill,
Agreed - if the client's session expires then this is possible. I don't believe that is what's happening here, though: peers usually catch up on commits really quickly, while session expiration takes some time, so it's unlikely that after expiration the client reconnects and finds a peer that is still less up-to-date. More likely he's creating a new client handle, or it's some other issue as Camille suggests.
Thanks, Alex
Re: Node being there and not at the same time
Alexander Shraer 2012-08-31, 05:21
Bill,
I'm sorry - you were right and I quoted entirely the wrong place in the code. The code that ensures that a client doesn't "go back in time" by connecting to a server that is less up to date than that client is most probably this check in ZooKeeperServer.java. I realized it after looking at Simon's question on the mailing list today...
    if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid) {
        String msg = "Refusing session request for client "
                + cnxn.getRemoteSocketAddress()
                + " as it has seen zxid 0x"
                + Long.toHexString(connReq.getLastZxidSeen())
                + " our last zxid is 0x"
                + Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
                + " client must try another server";
        // (excerpt: the server then refuses the session request)
    }
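To make the effect of that comparison concrete, here is a small self-contained model. It is a sketch under assumptions, not ZooKeeper source: `ServerModel` and `ClientModel` are hypothetical classes, the zxid values are made up, and the model assumes that a freshly constructed client handle reports a last-seen zxid of 0. It illustrates why a reconnecting handle cannot land on a server that is behind it, while a brand-new handle can.

```java
// Hypothetical model of the server-side check quoted above.
class ServerModel {
    final String name;
    final long lastProcessedZxid;

    ServerModel(String name, long lastProcessedZxid) {
        this.name = name;
        this.lastProcessedZxid = lastProcessedZxid;
    }

    // Mirrors the quoted condition: a client that has seen a newer zxid
    // than this server has processed is refused.
    boolean accepts(long clientLastZxidSeen) {
        return clientLastZxidSeen <= lastProcessedZxid;
    }
}

class ClientModel {
    // Assumption of this model: a fresh handle has seen nothing yet.
    long lastZxidSeen = 0;

    // Try the servers in order and return the first one that accepts this
    // client, or null if every server is behind the client.
    ServerModel connect(ServerModel... servers) {
        for (ServerModel server : servers) {
            if (server.accepts(lastZxidSeen)) {
                return server;
            }
        }
        return null;
    }
}
```

With a lagging server at zxid 0x10 and a current one at 0x12, a client that has already seen 0x12 is turned away by the first and ends up on the second. A new `ClientModel` with `lastZxidSeen == 0` is accepted by the lagging server, which is the situation Bill raised for code that creates a new session or handle instead of reconnecting the existing one.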
Re: Node being there and not at the same time
Bill Bridge 2012-08-31, 05:50
Nothing to be sorry about, I was wrong to suggest a client could see an old state by reconnecting. When you said that it should not be allowed I realized that had to be the case. I saw that email too and realized it had something to do with this subject.
It would seem nicer to simply do a sync() when this happens rather than refusing the connection. We could destroy the connection if the client is still in the future after a sync(). There is something seriously wrong if the client is still in the future after a sync(). If this happened with the current code the client would just keep trying until the connection finally worked and we would not find out that something is wrong. I suppose the client's last zxid could have been corrupted in his memory causing this problem. It would be good to have this disconnect and fail the client rather than spin.
Without the connection you cannot do the sync() yourself. It is conceivable that it will be a few seconds before another server is current enough to connect to. The other servers may also be in different data centers, so connecting to them would not be efficient.
Bill
Re: Node being there and not at the same time
Alexander Shraer 2012-08-31, 06:04
This sounds like a good idea. I'm not sure how easy it would be to implement as the client may need to be in a new sort of "conditional" state.
Alex
Re: Node being there and not at the same time
Mattias Persson 2012-08-31, 07:00
Thanks for your great feedback. I'll find out more about any reconnects around that time, and may post some more questions with some code if there still seem to be problems.
Best, Mattias
2012/8/31 Alexander Shraer <[EMAIL PROTECTED]>
> This sounds like a good idea. I'm not sure how easy it would be to implement as the client may need to be in a new sort of "conditional" state.
>
> Alex
>
> On Thu, Aug 30, 2012 at 10:50 PM, Bill Bridge <[EMAIL PROTECTED]> wrote:
>
> > Nothing to be sorry about, I was wrong to suggest a client could see an old state by reconnecting. When you said that it should not be allowed I realized that had to be the case. I saw that email too and realized it had something to do with this subject.
> >
> > It would seem nicer to simply do a sync() when this happens rather than refusing the connection. We could destroy the connection if the client is still in the future after a sync(). There is something seriously wrong if the client is still in the future after a sync(). If this happened with the current code the client would just keep trying until the connection finally worked and we would not find out that something is wrong. I suppose the client's last zxid could have been corrupted in his memory causing this problem. It would be good to have this disconnect and fail the client rather than spin.
> >
> > Without the connection you cannot do the sync() yourself. It is conceivable that it will be a few seconds before there is another server that is current enough to connect with. Maybe the other servers are in different data centers and would not be efficient to connect to them.
> >
> > Bill
> >
> > On 8/30/2012 10:21 PM, Alexander Shraer wrote:
> >
> > Bill,
> >
> > I'm sorry - you were right and I totally quoted the wrong place in the code. The code that ensures that a client doesn't "go back in time" by connecting to a server that is less up to date than that client is most probably this one from ZooKeeperServer.java. I realized it after looking on the question of Simon today in the mailing list...
> >
> > if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid)
> >     String msg = "Refusing session request for client "
> >         + cnxn.getRemoteSocketAddress()
> >         + " as it has seen zxid 0x"
> >         + Long.toHexString(connReq.getLastZxidSeen())
> >         + " our last zxid is 0x"
> >         + Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
> >         + " client must try another server";
> >
> > On Mon, Aug 27, 2012 at 10:22 AM, Bill Bridge <[EMAIL PROTECTED]> wrote:
> >
> >> Alex,
> >> You certainly know the code much better than I, so I may be mistaken here. It looks to me like waitForEpochAck() is about changes in the set of peers, and is not related to client connect/disconnects. I do not see how this would be called if a client disconnected due to some problem of his own, such as too slow to heartbeat, then reconnected to a different peer or observer.
> >>
> >> You suggest that a reconnecting client should ensure the new server has seen all transactions that the client has seen. This sounds like the right thing to do. This would certainly eliminate the race condition I postulated. This sounds like the kind of thing someone would have already thought of. If this is not already done then it would be a good change to make. I do not know where the code to do that would be. It could be part of the server reconnect code or it could be a sync() in the client library.
> >>
> >> If Mattias's code creates a new session when reconnecting, rather than reconnecting to the same session, then he could have the problem described even if reconnect ensures the client is not ahead of the server. He could
Mattias Persson, [[EMAIL PROTECTED]] Hacker, Neo Technology www.neotechnology.com
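For readers following the thread, the zxid comparison quoted above can be illustrated with a small standalone sketch. It is not ZooKeeper code: `ServerView`, `ReconnectSketch` and `firstAccepting` are hypothetical names invented for this illustration, and the sketch models only the comparison between the client's last seen zxid and the server's last processed zxid, not the real session handshake.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a server; it only tracks the last zxid it has processed. */
final class ServerView {
    final String name;
    final long lastProcessedZxid;

    ServerView(String name, long lastProcessedZxid) {
        this.name = name;
        this.lastProcessedZxid = lastProcessedZxid;
    }

    /** Mirrors the quoted condition: refuse a client that has seen a newer zxid than this server. */
    boolean accepts(long clientLastZxidSeen) {
        return clientLastZxidSeen <= lastProcessedZxid;
    }
}

final class ReconnectSketch {
    /** Returns the first server, in list order, that would accept the client, if any. */
    static Optional<ServerView> firstAccepting(List<ServerView> servers, long clientLastZxidSeen) {
        for (ServerView server : servers) {
            if (server.accepts(clientLastZxidSeen)) {
                return Optional.of(server);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<ServerView> servers = Arrays.asList(
                new ServerView("lagging", 0x10L),
                new ServerView("current", 0x12L));
        // The client has seen zxid 0x12, so the lagging server refuses it
        // and the client has to try another server.
        String chosen = firstAccepting(servers, 0x12L).map(s -> s.name).orElse("none");
        System.out.println(chosen); // prints "current"
    }
}
```

If no server is current enough, the sketch returns an empty result, which corresponds to the situation Bill describes where a client may have to wait until some server has caught up.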
-
Re: Node being there and not at the same time
Camille Fournier 2012-08-25, 01:17
In my experience helping people with ZK, this sort of thing is almost always due to a bug in the client's code. If you want to share your code with us we might be able to help, but I strongly suspect there is some edge case in the way you've written your code that you're not seeing and that is causing this behavior. However, it certainly wouldn't hurt to upgrade to the latest 3.3.X version, if for no other reason than that they do generally get better with every stable release.
C
On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've got a problem that I've seen at only a few occasions and which confuses me a bit. Basically I construct a ZooKeeper client (I'm running version 3.3.2) where there's a ZK quorum of size 3 running. I get a SyncConnected event in a Watcher of mine and in that watcher I do a get-or-create(-if-absent) behaviour where I first do a:
>
> zooKeeper.getData( myPath, false, null );
>
> if that produces a NONODE code I'll try to create it with:
>
> zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );
>
> If that fails with NODEEXISTS code I'll just get it, assuming someone else made it before me. What I see from this getData call that I do after getting this NODEEXISTS code, which is the same as the first one btw, is that I'll get a NONODE code back. Given in this scenario is that I'm 100% certain that this node exists in the quorum at myPath in the first place even.
>
> Questions:
> 1) How can this happen?
> 2) Do I use ZooKeeper here in an improper way?
> 3) Will a later version fix any potential issue I might have hit?
> 4) What's the guarantees around the state of my ZooKeeper instance after a receive a SyncConnected event, is it fully synced with the master at that point, or will a call to sync() be necessary first?
>
> Best,
> Mattias
>
> --
> Mattias Persson, [[EMAIL PROTECTED]]
> Hacker, Neo Technology
> www.neotechnology.com
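As a point of comparison for the get-or-create flow quoted above, here is a minimal sketch of the same logic written as a bounded retry loop, so that a NONODE seen after a NODEEXISTS leads to another attempt instead of a failure. It deliberately does not use the ZooKeeper client: `InMemoryStore`, `NoNodeException`, `NodeExistsException` and `GetOrCreate` are hypothetical stand-ins made up for this example, and the retry limit is an assumption rather than anything suggested in the thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-ins for the NONODE and NODEEXISTS result codes. */
class NoNodeException extends RuntimeException {}
class NodeExistsException extends RuntimeException {}

/** Hypothetical in-memory store standing in for the getData/create calls in the post. */
final class InMemoryStore {
    private final ConcurrentMap<String, byte[]> nodes = new ConcurrentHashMap<>();

    byte[] getData(String path) {
        byte[] data = nodes.get(path);
        if (data == null) {
            throw new NoNodeException();
        }
        return data;
    }

    void create(String path, byte[] data) {
        if (nodes.putIfAbsent(path, data) != null) {
            throw new NodeExistsException();
        }
    }

    void delete(String path) {
        nodes.remove(path);
    }
}

final class GetOrCreate {
    /** Tries get, then create, and starts over if another party wins the create race. */
    static byte[] getOrCreate(InMemoryStore store, String path, byte[] initial, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return store.getData(path);
            } catch (NoNodeException absent) {
                // Node is missing (or was deleted in between); fall through to create.
            }
            try {
                store.create(path, initial);
                return initial;
            } catch (NodeExistsException raced) {
                // Someone else created it first; loop around and read it.
            }
        }
        throw new NoNodeException();
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        byte[] created = getOrCreate(store, "/my/path", new byte[] {1}, 3);
        byte[] existing = getOrCreate(store, "/my/path", new byte[] {2}, 3);
        System.out.println(created[0] + " " + existing[0]); // prints "1 1"
    }
}
```

The loop only covers the case David Nickerson raises, where another party may create or delete the node between calls. It does not by itself explain or fix a stale read after reconnecting to a different server.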