Zookeeper, mail # user - Zookeeper protocol weirdness and pure Python kazoo client


Re: Zookeeper protocol weirdness and pure Python kazoo client
Ben Bangert 2012-08-31, 00:06
On Aug 30, 2012, at 4:50 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:

> I don't think there is an official spec beyond the zookeeper.jute
> file. It would be very helpful if you can share what you have found
> implementing your python client.

The most interesting thing I found was that the documented C/Java behavior isn't actually enforced beyond the client boundaries. For some reason I had expected the listed state transitions to be a limitation of the Zookeeper protocol itself. That has made me curious about how the client implementation itself could make Zookeeper less error-prone to use, which is what Curator and Kazoo each try to do in their respective languages. It is also part of why I was baffled when an AUTH_FAILED resulted in connection termination, but not session expiration.

In the case of kazoo, it's now substantially easier to debug what is actually happening, since it's always within the bounds of Python. Reinstalling debug builds of the C lib and Python C binding got hairy, and still obfuscated a lot. This made debugging the prior password mangling issue in the Python lib a major pain.

I've found that the C lib seems like a bit of a second-class citizen for Zookeeper... the read-only feature has had patches submitted since 2010, which still aren't applied. The Python C binding also has multiple patches... not applied. We applied several of these to our zc-zookeeper-static Python lib to fix memory leaks and the password mangling, but after looking through all the remaining C lib bugs/patches and Python C binding bugs/patches, it's way more than we want to deal with. C isn't my forte, and I'd much rather debug Java code or read the Java client if I'm curious about how something is done than deal with 2 layers of C code in addition to TCP and the Java server.

Meanwhile, I can implement the new Zookeeper 3.4 methods in just minutes after looking at the jute code, and debugging a problem is trivial when it's all Python code. Some people have mentioned that without the C OS thread used by the C binding, it's possible that under heavy Python thread thrashing a ping might not be sent... which is true, but the session timeout can be increased, and that's a very small price to pay given the other things noted here with the C lib/binding.
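To give a feel for why working from the jute definitions is quick, here is a minimal sketch of how a ping could be encoded by hand. It assumes the usual jute conventions (4-byte big-endian ints, length-prefixed strings, a 4-byte length prefix on each frame) and the ping xid/opcode values used by the Java client (-2 and 11). The helper names are made up for illustration and are not kazoo's actual API.

```python
import struct

# Values taken from the Java client; treat them as assumptions here.
PING_XID = -2      # xid reserved for ping requests
PING_OPCODE = 11   # OpCode.ping


def write_int(value):
    """Encode a jute 'int' as a signed 4-byte big-endian value."""
    return struct.pack(">i", value)


def write_ustring(value):
    """Encode a jute 'ustring': int length, then UTF-8 bytes (-1 for null)."""
    if value is None:
        return write_int(-1)
    data = value.encode("utf-8")
    return write_int(len(data)) + data


def frame(payload):
    """Prefix a serialized record with its length, as sent on the wire."""
    return write_int(len(payload)) + payload


def ping_request():
    """A ping is just a RequestHeader {int xid; int type;} with no body."""
    return frame(write_int(PING_XID) + write_int(PING_OPCODE))
```

Under those assumptions a whole ping is 12 bytes on the wire, and a new request type is mostly a matter of adding another record made of the same primitives.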

And of course, for other runtimes (PyPy) that can't run C extensions, the pure Python kazoo will now work.

> I think this is expected. ZooKeeper should not expire a session
> because of authentication failure. That would make it easier for a
> malicious client to expire random sessions. I don't know if there is a
> technical reason for dropping the connection though. Maybe it was an
> arbitrary decision?

Yeah, I wasn't terribly sure. The state diagram in the docs seems to indicate that an AUTH_FAILED should be treated the same as a SESSION_EXPIRATION or CLOSING event. However, your session is dead in the latter two cases, while the session is *not* dead in AUTH_FAILED, yet you end up in the same state. I would not be surprised if a substantial amount of code assumed that the session was dead when an AUTH_FAILED occurred, even though it is not dead at all.
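One way a client could make the distinction explicit is to track "connection closed" and "session dead" separately instead of collapsing them into a single terminal state. The sketch below is purely illustrative: the enum and function names are hypothetical and do not come from kazoo or from the C/Java clients, and the table only encodes the behavior described in this thread.

```python
from enum import Enum


class Event(Enum):
    AUTH_FAILED = "auth_failed"
    SESSION_EXPIRED = "session_expired"
    CLOSING = "closing"


# Hypothetical table: event -> (connection_closed, session_dead).
# AUTH_FAILED drops the connection, but the server keeps the session.
OUTCOMES = {
    Event.AUTH_FAILED: (True, False),
    Event.SESSION_EXPIRED: (True, True),
    Event.CLOSING: (True, True),
}


def session_is_dead(event):
    """True only when the server-side session is actually gone."""
    _connection_closed, session_dead = OUTCOMES[event]
    return session_dead


def ephemerals_may_still_exist(event):
    """Ephemeral nodes belong to the session, so they outlive AUTH_FAILED."""
    return not session_is_dead(event)
```

Code that cleans up or re-creates ephemeral nodes would then key off `session_is_dead()` rather than off the fact that the connection went away.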

Cheers,
Ben