-
ephemeral node not removed after the client session is long gone
Jun Rao 2011-09-25, 18:54
Hi,

We found our ZK server in a state where an ephemeral node still exists after a client session is long gone. I used the cons command on each ZK host to list all connections and couldn't find the ephemeralOwner id. We are using ZK 3.3.3. Has anyone seen this problem?

Thanks,
Jun
-
Re: ephemeral node not removed after the client session is long gone
Camille Fournier 2011-09-25, 19:56
Can you give us more details, such as information about the client that created the node and any logs from around the time it was created or the client disconnected?
-
Re: ephemeral node not removed after the client session is long gone
kishore g 2011-09-26, 05:06
Hi,

I got the following information from the logs.

The node that still exists is
/kafka-tracking/consumers/UserPerformanceEvent-<host>/owners/UserPerformanceEvent/529-7

I saw that the ephemeral owner is 86167322861045079, which is session id 0x13220b93e610550.

After searching the transaction log of one of the ZK servers, I found that the session expired:

9/22/11 12:17:57 PM PDT session 0x13220b93e610550 cxid 0x74 zxid 0x601bd36f7 closeSession null

On digging further into the logs, I found that multiple sessions were created in quick succession and every session tried to create the same node. But I verified that the sessions were closed and opened in order:

9/22/11 12:17:56 PM PDT session 0x13220b93e610550 cxid 0x0 zxid 0x601bd36b5 createSession 6000
9/22/11 12:17:57 PM PDT session 0x13220b93e610550 cxid 0x74 zxid 0x601bd36f7 closeSession null
9/22/11 12:17:58 PM PDT session 0x13220b93e610551 cxid 0x0 zxid 0x601bd36f8 createSession 6000
9/22/11 12:17:59 PM PDT session 0x13220b93e610551 cxid 0x74 zxid 0x601bd373a closeSession null
9/22/11 12:18:00 PM PDT session 0x13220b93e610552 cxid 0x0 zxid 0x601bd373e createSession 6000
9/22/11 12:18:01 PM PDT session 0x13220b93e610552 cxid 0x6c zxid 0x601bd37a0 closeSession null
9/22/11 12:18:02 PM PDT session 0x13220b93e610553 cxid 0x0 zxid 0x601bd37e9 createSession 6000
9/22/11 12:18:03 PM PDT session 0x13220b93e610553 cxid 0x74 zxid 0x601bd382b closeSession null
9/22/11 12:18:04 PM PDT session 0x13220b93e610554 cxid 0x0 zxid 0x601bd383c createSession 6000
9/22/11 12:18:05 PM PDT session 0x13220b93e610554 cxid 0x6a zxid 0x601bd388f closeSession null
9/22/11 12:18:06 PM PDT session 0x13220b93e610555 cxid 0x0 zxid 0x601bd3895 createSession 6000
9/22/11 12:18:07 PM PDT session 0x13220b93e610555 cxid 0x6a zxid 0x601bd38cd closeSession null
9/22/11 12:18:10 PM PDT session 0x13220b93e610556 cxid 0x0 zxid 0x601bd38d1 createSession 6000
9/22/11 12:18:11 PM PDT session 0x13220b93e610557 cxid 0x0 zxid 0x601bd38f2 createSession 6000
9/22/11 12:18:11 PM PDT session 0x13220b93e610557 cxid 0x51 zxid 0x601bd396a closeSession null

Here is the log output for the sessions that tried creating the same node:

9/22/11 12:17:54 PM PDT session 0x13220b93e61054f cxid 0x42 zxid 0x601bd366b create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:17:56 PM PDT session 0x13220b93e610550 cxid 0x42 zxid 0x601bd36ce create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:17:58 PM PDT session 0x13220b93e610551 cxid 0x42 zxid 0x601bd3711 create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:18:00 PM PDT session 0x13220b93e610552 cxid 0x42 zxid 0x601bd3777 create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:18:02 PM PDT session 0x13220b93e610553 cxid 0x42 zxid 0x601bd3802 create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:18:05 PM PDT session 0x13220b93e610554 cxid 0x44 zxid 0x601bd385d create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:18:07 PM PDT session 0x13220b93e610555 cxid 0x44 zxid 0x601bd38b0 create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7
9/22/11 12:18:11 PM PDT session 0x13220b93e610557 cxid 0x52 zxid 0x601bd396b create '/kafka-tracking/consumers/UserPerformanceEvent-<hostname>/owners/UserPerformanceEvent/529-7

Let me know if you need additional information.

Thanks,
Kishore G
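A quick way to cross-check a decimal ephemeralOwner against the hex session ids printed in the transaction log is to format it as hex. The sketch below is only an illustration; the class and method names are made up and are not part of ZooKeeper. Worked through this way, the decimal value quoted in this message comes out ending in ...557 rather than ...550, so it may be worth re-checking which of the listed sessions actually owns the node.

```java
public class SessionIdHex {
    /** Formats a session id as 0x-prefixed lowercase hex, matching the style of the log lines. */
    public static String toHex(long sessionId) {
        return "0x" + Long.toHexString(sessionId);
    }

    public static void main(String[] args) {
        // Decimal ephemeralOwner value quoted in the message.
        long ephemeralOwner = 86167322861045079L;
        System.out.println(toHex(ephemeralOwner)); // prints 0x13220b93e610557
    }
}
```

The same helper can be applied to any ephemeralOwner value reported for a node before searching the transaction log for the matching session.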
-
Re: ephemeral node not removed after the client session is long gone
kishore g 2011-09-27, 20:54
Any idea why this happened? I tried reproducing it by doing the following repeatedly:

* create a session
* create an ephemeral node
* close the session

I could not reproduce the issue. We can provide additional details if needed.

Thanks,
Kishore G
-
RE: ephemeral node not removed after the client session is long gone
Fournier, Camille F. 2011-09-27, 21:20
So, the node was created by 0x13220b93e610550 at 12:17:56, then that session closed at 12:17:57, the node was not deleted, and a bunch of other sessions later tried to create the node. These sessions got nodeexists failures, I presume?

Forgive the block of text I'm going to write instead of code:

I'm going to bet that the problem lies in PrepRequestProcessor. If we get the ephemerals for the session while an ephemeral is still in outstandingChanges and has not been committed, and another thread then commits that ephemeral and removes it from outstandingChanges before we synchronize on outstandingChanges, we would never put it in the ephemeral set that we use to track the ephemerals to delete. I think we need to move the synchronized block up before we get the ephemerals from the database. But this is a bit of speculation at the moment. Can you create a JIRA tracker for me to look at this?

Thanks,
C
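The interleaving described in this message can be replayed step by step in a toy model. This is only a sketch under stated assumptions: the class, the two collections, and the shortened path are hypothetical stand-ins for the database and outstandingChanges, not the real ZooKeeper classes, and the two threads are replaced by a fixed ordering of steps so the result is deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the suspected race; NOT the real ZooKeeper code. */
public class EphemeralRaceSketch {
    // Stand-in for the committed database: ephemeral paths owned by one session.
    private static final Set<String> committed = new HashSet<>();
    // Stand-in for outstandingChanges: creates that are prepared but not yet committed.
    private static final List<String> outstanding = new ArrayList<>();

    public static boolean isCommitted(String path) {
        return committed.contains(path);
    }

    /** Replays the suspected interleaving and returns the paths closeSession would delete. */
    public static Set<String> closeSessionWithRace(String path) {
        committed.clear();
        outstanding.clear();
        outstanding.add(path); // the create has been prepared but not committed

        // Step 1: closeSession reads the committed ephemerals outside the lock.
        Set<String> es = new HashSet<>(committed); // path is not committed yet

        // Step 2: another thread commits the create and drops it from the list.
        committed.add(path);
        outstanding.remove(path);

        // Step 3: closeSession scans the outstanding list under the lock.
        synchronized (outstanding) {
            es.addAll(outstanding); // path is no longer here either
        }
        return es; // the path was missed in both places
    }

    public static void main(String[] args) {
        String path = "/owners/529-7"; // shortened, hypothetical path
        System.out.println("scheduled for delete: " + closeSessionWithRace(path));
        System.out.println("still in database: " + isCommitted(path));
    }
}
```

In this model the delete set comes back empty while the node is in the database, which is the symptom reported at the top of the thread. The model ignores pending deletes and everything else the real processor does.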
-
Re: ephemeral node not removed after the client session is long gone
Mahadev Konar 2011-09-27, 23:26
Camille,

I am a little confused by the explanation. You might want to update the JIRA (if Kishore opens one) with a little more detail.

thanks
mahadev
-
Re: ephemeral node not removed after the client session is long gone
Camille Fournier 2011-09-28, 00:42
This:

    HashSet<String> es = zks.getZKDatabase()
            .getEphemerals(request.sessionId);

is outside the synchronized block that deletes the ephemerals:

    synchronized (zks.outstandingChanges) {
        for (ChangeRecord c : zks.outstandingChanges) {
            if (c.stat == null) {
                // Doing a delete
                es.remove(c.path);
            } else if (c.stat.getEphemeralOwner() == request.sessionId) {
                es.add(c.path);
            }
        }
        for (String path2Delete : es) {
            addChangeRecord(new ChangeRecord(txnHeader.getZxid(),
                    path2Delete, null, 0, null));
        }
    }
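The reordering suggested earlier in the thread (taking the snapshot of ephemerals inside the synchronized block) can be sketched with a toy model. Everything here is hypothetical and is not the actual ZooKeeper patch: the class, field, and method names are made up, pending deletes are ignored, and the sketch assumes the committing thread applies the change and removes it from the list while holding the same lock.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the proposed reordering; NOT the actual ZooKeeper patch. */
public class SnapshotInsideLockSketch {
    // Stand-in for the committed database of ephemeral paths for one session.
    private final Set<String> committed = new HashSet<>();
    // Stand-in for outstandingChanges; also used as the lock, as in the snippet.
    private final List<String> outstanding = new ArrayList<>();

    public void prepareCreate(String path) {
        synchronized (outstanding) {
            outstanding.add(path);
        }
    }

    /** Assumption: the commit moves the path into the database under the same lock. */
    public void commitCreate(String path) {
        synchronized (outstanding) {
            committed.add(path);
            outstanding.remove(path);
        }
    }

    /** Snapshot and scan share one critical section, so a commit cannot slip in between. */
    public Set<String> ephemeralsToDelete() {
        synchronized (outstanding) {
            Set<String> es = new HashSet<>(committed);
            es.addAll(outstanding);
            return es;
        }
    }

    public static void main(String[] args) {
        SnapshotInsideLockSketch s = new SnapshotInsideLockSketch();
        String path = "/owners/529-7"; // shortened, hypothetical path
        s.prepareCreate(path);
        System.out.println(s.ephemeralsToDelete()); // create still outstanding
        s.commitCreate(path);
        System.out.println(s.ephemeralsToDelete()); // create already committed
    }
}
```

Whichever side of the commit the close lands on, the path shows up in the delete set in this model, because the snapshot and the scan can no longer be separated by a commit.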
-
Re: ephemeral node not removed after the client session is long gone
kishore g 2011-09-28, 04:38
Here is the JIRA.
https://issues.apache.org/jira/browse/ZOOKEEPER-1208
-
Re: ephemeral node not removed after the client session is long gone
Mahadev Konar 2011-09-29, 07:20
Camille,

Well done! I think that's probably the bug. You are right: there is a race condition between the outstanding changes and the ephemeral nodes. Great catch!

thanks
mahadev