Re: Zookeeper 3.3.4 guarantees across sessions from the same client?
Neha Narkhede 2013-08-01, 21:06
Thanks a bunch for checking on that and confirming the behavior.
I looked at the code in PrepRequestProcessor where zookeeper throws
NodeExists. It only checks that the ephemeral node exists, but doesn't
check if the ephemeral owner session is marked as "expired". I was
wondering if it is a feasible solution to detect that the owner is a
session that is getting expired, so instead of returning NodeExists, let
the write go through and just replace the ephemeral owner with the new
session id. However, this may have other side-effects that I'm not aware
of. Curious to know what you think of this fix.
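
To make the idea concrete, here is a toy model of the check I have in mind. It is only a sketch and does not reflect how PrepRequestProcessor is actually structured: ToyServer, its fields, and its methods are all made-up names. It does show one side-effect right away: the delayed cleanup for the expired session must skip nodes that have since been handed to a new owner.

```python
class NodeExists(Exception):
    """Stand-in for the NodeExists error code."""


class ToyServer:
    """Hypothetical in-memory model, not ZooKeeper server code."""

    def __init__(self):
        self.owners = {}      # ephemeral path -> owning session id
        self.closing = set()  # sessions marked expired, cleanup still pending

    def create_ephemeral(self, path, session_id):
        owner = self.owners.get(path)
        # Reject only if the node exists AND its owner is still live.
        if owner is not None and owner not in self.closing:
            raise NodeExists(path)
        # No node, or the owner is expiring: let the write through
        # and record the new session as the ephemeral owner.
        self.owners[path] = session_id

    def expire(self, session_id):
        # The client may already be told "expired" at this point.
        self.closing.add(session_id)

    def finish_cleanup(self, session_id):
        # Delete only the nodes still owned by the expired session,
        # so a node taken over by a new session survives.
        for path in [p for p, o in self.owners.items() if o == session_id]:
            del self.owners[path]
        self.closing.discard(session_id)
```

With this model, a create from session 2 issued between expire(1) and finish_cleanup(1) succeeds, and the node is still there after the cleanup runs.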
In any case, we can handle it in Kafka by writing the timestamp as part of
the ephemeral path. On NodeExists, we can read the path back and retry the
operation if the timestamp doesn't match.
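
A rough sketch of that client-side retry, written against a toy in-memory stand-in rather than the real ZooKeeper or zkclient API. FakeZk, NodeExistsError, NoNodeError, and create_ephemeral_with_retry are hypothetical names, and I am assuming the timestamp is stored as the node's data. The fake mimics the delayed cleanup by dropping a planted stale node after a fixed number of calls.

```python
import time


class NodeExistsError(Exception):
    """Stand-in for the NodeExists error code."""


class NoNodeError(Exception):
    """Stand-in for the NoNode error code."""


class FakeZk:
    """Toy in-memory store (hypothetical, not the ZooKeeper client API)."""

    def __init__(self):
        self.nodes = {}      # path -> data
        self.stale_ttl = {}  # path -> calls left before the stale node is deleted

    def add_stale(self, path, data, calls_until_deleted):
        """Plant a leftover ephemeral from an already-expired session."""
        self.nodes[path] = data
        self.stale_ttl[path] = calls_until_deleted

    def _tick(self):
        # Simulate the server finishing its delayed ephemeral cleanup.
        for path in list(self.stale_ttl):
            self.stale_ttl[path] -= 1
            if self.stale_ttl[path] <= 0:
                del self.stale_ttl[path]
                self.nodes.pop(path, None)

    def create(self, path, data):
        self._tick()
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = data

    def get(self, path):
        self._tick()
        if path not in self.nodes:
            raise NoNodeError(path)
        return self.nodes[path]


def create_ephemeral_with_retry(zk, path, timestamp,
                                max_attempts=10, backoff_s=0.0):
    """Create `path` carrying `timestamp`. On NodeExists, read the node back
    and retry unless the stored timestamp is our own."""
    for _ in range(max_attempts):
        try:
            zk.create(path, timestamp)
            return True
        except NodeExistsError:
            try:
                if zk.get(path) == timestamp:
                    return True  # our own earlier create already went through
            except NoNodeError:
                pass  # the stale node vanished between create and get
            # Timestamp mismatch: node belongs to the old session, so wait.
            time.sleep(backoff_s)
    return False
```

The retry is bounded so that a node legitimately held by some other live client surfaces as a failure instead of looping forever; in a real deployment backoff_s would be non-zero.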
On Thu, Aug 1, 2013 at 2:07 AM, FPJ <[EMAIL PROTECTED]> wrote:
> Hi Neha,
> Unless we expire a session and delete ephemerals atomically, there are only
> two options I see:
> 1- Delete right before expiring the session
> 2- Delete right after expiring the session
> Because of timing, we can have the following. With the first, a client may
> observe the delete before the session actually expires, which violates our
> contract. With the second, you may observe an ephemeral znode after the
> session has expired as you have. I would say that the second option is
> correct as long as the ephemerals are eventually deleted, but it does have
> the side-effect you're mentioning.
> > -----Original Message-----
> > From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> > Sent: 01 August 2013 02:42
> > To: [EMAIL PROTECTED]
> > Subject: Zookeeper 3.3.4 guarantees across sessions from the same client
> > The behavior we saw on one of our zookeeper clients is as follows. The
> > session expires on the client, it assumes the ephemeral nodes are deleted,
> > so it establishes a new session with zookeeper and tries to re-create the
> > ephemeral nodes. However, when it tries to re-create the ephemeral node,
> > zookeeper throws back a NodeExists error code. Now this is legitimate during
> > a session disconnect event (since zkclient automatically retries the operation
> > and raises a NodeExists error). Also by design, Kafka doesn't have multiple
> > clients create the same ephemeral node, so Kafka server assumes the
> > NodeExists is normal. However, after a few seconds zookeeper deletes that
> > ephemeral node. So from the client's perspective, even though the client has
> > a new valid session, its ephemeral node is gone.
> > After poking at the transaction and log4j logs, I saw that the NodeExists occurred
> > because the zookeeper leader had retained the ephemeral node from the
> > previous expired session. It turns out that it notified the client of the
> > expiration before actually deleting the ephemeral node. It is also worth
> > noting that the previous session was expired due to a long fsync
> > on the zookeeper leader. After it returned from the fsync, it had a whole
> > bunch of sessions to expire.
> > In this case, it seems that zookeeper should not notify the client that its
> > session is expired until the ephemeral node information is actually gone.
> > Or maybe I'm not clear on what the guarantees from zookeeper are, across
> > sessions from the same client.
> > Thanks,
> > Neha