Zookeeper, mail # dev - zookeeper_interest returning ZOK on Connection Loss


Yunong Xiao 2012-11-04, 22:11
Re: zookeeper_interest returning ZOK on Connection Loss
Michi Mutsuzaki 2012-11-05, 18:33
Hi Yunong,

Yes, this looks like a bug. The problem is that the C client is not
handling the case when connect() returns EINPROGRESS or EWOULDBLOCK
and eventually fails. I think the right fix is to check SO_ERROR after
the socket becomes writable. Please go ahead and open a jira.

Thanks!
--Michi
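
For illustration, the SO_ERROR check suggested above could look roughly like the following. This is only a sketch under assumptions, not the actual patch to zookeeper.c: the helper names `finish_nonblocking_connect` and `try_connect` are hypothetical, and only standard POSIX socket calls are used.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (not part of the ZooKeeper C client).
 * Waits until fd becomes writable, then reads SO_ERROR to learn whether
 * the pending non-blocking connect() succeeded.
 * Returns 0 on success, or an errno-style code on failure. */
int finish_nonblocking_connect(int fd, int timeout_ms)
{
    fd_set wfds;
    struct timeval tv;
    int so_error = 0;
    socklen_t len = sizeof(so_error);
    int rc;

    assert(timeout_ms >= 0);

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (rc == 0)
        return ETIMEDOUT;           /* never became writable */
    if (rc < 0)
        return errno;               /* select() itself failed */

    /* Writable does not mean connected: a failed connect also wakes us. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
        return errno;
    return so_error;                /* 0 means the connect completed */
}

/* Hypothetical driver: starts a non-blocking connect to ip:port and
 * reports the final outcome as 0 or an errno-style code. */
int try_connect(const char *ip, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr;
    int fd, flags, err;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return EINVAL;              /* not a dotted-quad IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        err = errno;
        close(fd);
        return err;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        err = 0;                    /* connected immediately */
    else if (errno == EINPROGRESS || errno == EWOULDBLOCK)
        err = finish_nonblocking_connect(fd, timeout_ms);
    else
        err = errno;                /* immediate failure */

    close(fd);
    return err;
}
```

With something like this, calling `try_connect("127.0.0.1", port, 1000)` against a port with no listener should yield a non-zero code (typically ECONNREFUSED) instead of looking like a connect that is still in progress.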

On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao <[EMAIL PROTECTED]> wrote:
> I have a fairly simple single-threaded C client set up -- single-threaded
> because we are embedding zk in the node.js/libuv runtime -- which consists of
> the following algorithm:
>
> zookeeper_interest();
> select();
> // perform zookeeper api calls
> zookeeper_process();
>
> I've noticed that zookeeper_interest in the C client never returns an error if
> it is unable to connect to the zk server.
>
> From the spec of the zookeeper_interest API, I see that zookeeper_interest is
> supposed to return ZCONNECTIONLOSS when the client is disconnected from the
> server. However, digging into the code, I see that the client is making a
> non-blocking connect call
> (https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613)
> and returning ZOK
> (https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684).
>
> If we assume that the server is not up, this will mean that the subsequent
> select() call would return 0, since the fd is not ready, and future calls to
> zookeeper_interest will always return 0 and not the expected ZCONNECTIONLOSS.
> Thus an upstream client will never be aware that the connection is lost.
>
> I don't think this is the expected behavior. I have temporarily patched the zk
> C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's still
> unable to connect after session_timeout has been exceeded.
>
> Is this the right interpretation of the API? Are you guys open to taking the
> patch I described?
>
> -Yunong
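
The temporary workaround described in the quoted message (report a connection loss once a connect attempt has outlasted the session timeout) can be sketched as below. This is not the actual patch: `conn_tracker` and its functions are hypothetical names, the two status codes are stand-ins for the client's ZOK and ZCONNECTIONLOSS, and the current time is passed in explicitly so the logic can be exercised without a real clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in status codes for this sketch only; the real client defines
 * ZOK and ZCONNECTIONLOSS in zookeeper.h. */
enum { SKETCH_OK = 0, SKETCH_CONNECTION_LOSS = -4 };

/* Hypothetical bookkeeping for one pending connect attempt. */
typedef struct {
    bool connecting;             /* a non-blocking connect is in flight */
    bool connected;              /* the connect has completed */
    int64_t connect_started_ms;  /* when the current attempt began */
    int64_t session_timeout_ms;  /* give up after this long */
} conn_tracker;

/* Record the start of a new connect attempt. */
void tracker_begin_connect(conn_tracker *t, int64_t now_ms)
{
    assert(t != NULL);
    t->connecting = true;
    t->connected = false;
    t->connect_started_ms = now_ms;
}

/* Record that the pending connect attempt succeeded. */
void tracker_mark_connected(conn_tracker *t)
{
    assert(t != NULL);
    t->connecting = false;
    t->connected = true;
}

/* Called once per loop iteration, where zookeeper_interest() would run.
 * Returns SKETCH_CONNECTION_LOSS if the pending connect has taken at
 * least session_timeout_ms, and SKETCH_OK otherwise. */
int tracker_poll(const conn_tracker *t, int64_t now_ms)
{
    assert(t != NULL);
    if (t->connected)
        return SKETCH_OK;
    if (t->connecting &&
        now_ms - t->connect_started_ms >= t->session_timeout_ms)
        return SKETCH_CONNECTION_LOSS;
    return SKETCH_OK;
}
```

A timeout like this only bounds how long the caller waits; it does not distinguish a refused connection from a slow one, which is why checking SO_ERROR once the socket becomes writable is the more direct fix.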
Yunong Xiao 2012-11-05, 21:19
Michi Mutsuzaki 2012-11-05, 21:59
Yunong Xiao 2012-11-05, 22:16
Michi Mutsuzaki 2012-11-06, 07:24
Yunong Xiao 2012-11-14, 01:08
Anthony Barré 2013-03-27, 09:41