Re: Zookeeper protocol weirdness and pure Python kazoo client
Henry Robinson 2012-08-31, 00:10
On 30 August 2012 17:06, Ben Bangert <[EMAIL PROTECTED]> wrote:
> On Aug 30, 2012, at 4:50 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
> > I don't think there is an official spec beyond the zookeeper.jute
> > file. It would be very helpful if you can share what you have found
> > implementing your python client.
> The most interesting thing I found was that documented C/Java behavior
> isn't actually enforced beyond the client boundaries. For some reason I had
> expected the state transitions listed to be some limitation of the
> Zookeeper protocol itself. So it has had me curious about ways the client
> implementation itself could make using Zookeeper less error-prone, as
> Curator and Kazoo respectively try to make working with Zookeeper less
> error-prone in their languages. This is part of why I was baffled when an
> AUTH_FAILED resulted in connection termination, but not session expiration.
> In the case of kazoo, it's now substantially easier to debug what is
> actually happening since it's always within the bounds of Python.
> Reinstalling debug builds of the C lib and Python C binding got hairy, and
> still obfuscated a lot. This made debugging the prior password mangling
> issue in the Python lib a major pain.
> I've found that the C lib seems like a bit of a second-class citizen for
> Zookeeper... the read-only feature has had patches submitted since 2010,
> which still aren't applied. The Python C binding also has multiple
> patches... not applied. We applied several of these to fix memory leaks and
> the password mangling to our zc-zookeeper-static Python lib, but after
> looking through all the remaining C lib bugs/patches and Python C
> bugs/patches, it's way more than we want to deal with. C isn't my forte, and
> I'd much much rather debug Java code or read the Java client if I'm curious
> about how something is done, rather than deal with 2 layers of C code in
> addition to TCP and the Java server.
> Meanwhile, I can implement the new Zookeeper 3.4 methods after looking at
> the jute code in just minutes, and debugging a problem is trivial when it's
> all Python code. Some people have mentioned that without the C OS thread
> used by the C binding, it's possible under heavy Python thread thrashing that
> a ping might not be sent... which is true, but the session timeout can be
> increased and that's a very small price to pay given the other things noted
> here with the C lib/binding.
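To put rough numbers on the missed-ping trade-off mentioned above, here is a minimal sketch. It assumes the client sends a ping after a fixed idle interval and that the server expires the session once it has heard nothing for the full session timeout; network latency is ignored. The function name and the one-third ping ratio are illustrative assumptions, not a description of how kazoo or the C client actually schedule pings.

```python
def tolerable_stall(session_timeout, ping_interval):
    """Return how long (in seconds) the pinging thread may be stalled,
    at the moment a ping falls due, before the session would time out.

    Assumption: the last packet went out `ping_interval` seconds ago and
    the server expires the session after `session_timeout` seconds of
    silence. Both parameters are hypothetical, for illustration only.
    """
    if ping_interval <= 0 or ping_interval >= session_timeout:
        raise ValueError("ping_interval must be in (0, session_timeout)")
    return session_timeout - ping_interval


if __name__ == "__main__":
    for timeout in (10.0, 30.0):
        interval = timeout / 3  # assumed ratio, not taken from any client
        slack = tolerable_stall(timeout, interval)
        print("timeout=%.0fs ping every %.1fs -> can stall %.1fs"
              % (timeout, interval, slack))
```

Under these assumptions, raising the session timeout widens the window in which a starved Python thread can still get its ping out, at the cost of slower detection of genuinely dead clients.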
> And of course, for other runtimes (PyPy) that can't run C extensions, the
> pure Python kazoo will now work.
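As an illustration of why a jute-defined request is quick to implement in pure Python, here is a minimal sketch that encodes the body of an exists request. It assumes the record has a `ustring path` followed by a `boolean watch`, and that jute's binary encoding uses 4-byte big-endian integers, length-prefixed UTF-8 strings, and one byte per boolean; that is my reading of zookeeper.jute and the jute binary archive, not an official spec. The helper names are hypothetical, and the sketch deliberately leaves out the outer length prefix and request header that a real client would also send.

```python
import struct


def write_int(value):
    """Encode a jute int as 4 bytes, big-endian, signed."""
    return struct.pack(">i", value)


def write_ustring(text):
    """Encode a jute ustring: byte length as an int, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return write_int(len(data)) + data


def write_bool(flag):
    """Encode a jute boolean as a single byte (0 or 1)."""
    return struct.pack(">?", bool(flag))


def serialize_exists_request(path, watch):
    """Encode the assumed record body: ustring path, boolean watch."""
    return write_ustring(path) + write_bool(watch)


if __name__ == "__main__":
    print(repr(serialize_exists_request("/a", True)))
```

Each jute field type maps to a one-line helper, so adding a new request type is mostly a matter of listing its fields in order.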
FWIW, this is my only reservation about a pure Python client - there isn't
a spec, and three separate implementations that might have subtly different
behaviours can be a nightmare to maintain. Ben - if you're able to turn any
of your efforts towards documenting your observations about how the
protocol actually works, that would be awesome.
And as regards the unapplied Python patches - that's my bad, I should be
committing them much more often. Can you give me a list of those you've
found useful, and in return for your excellent work I'll get them committed
as soon as I can?
> > I think this is expected. ZooKeeper should not expire a session
> > because of authentication failure. That would make it easier for a
> > malicious client to expire random sessions. I don't know if there is a
> > technical reason for dropping the connection though. Maybe it was an
> > arbitrary decision?
> Yea, I wasn't terribly sure. The doc state diagram seems to indicate that
> an AUTH_FAILED should be treated the same as a SESSION_EXPIRATION or
> CLOSING event. However, your session is dead in the latter two cases, while
> the session is *not* dead in AUTH_FAILED, yet you end up in the same state.
> I would not be surprised if a substantial amount of code assumed that the
> session was dead when an AUTH_FAILED occurred, yet the session is not dead
> at all.
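The distinction described above can be made explicit in client code. The following is a minimal sketch using hypothetical names (`ClientEvent`, `session_survives`, `describe`); it is not the API of kazoo, Curator, or the C/Java clients. It only encodes what this thread reports: all three events end the connection, but only session expiration and closing mean the session itself is gone.

```python
from enum import Enum


class ClientEvent(Enum):
    """Hypothetical event names, mirroring the ones discussed above."""
    AUTH_FAILED = "auth_failed"
    SESSION_EXPIRED = "session_expired"
    CLOSING = "closing"


# Events after which the session is dead, per the discussion above.
SESSION_DEAD = frozenset({ClientEvent.SESSION_EXPIRED, ClientEvent.CLOSING})


def session_survives(event):
    """True if the session is still alive after the event."""
    return event not in SESSION_DEAD


def describe(event):
    """Summarize what the application can assume after the event."""
    if session_survives(event):
        return "connection dropped, session still alive"
    return "connection dropped, session dead"


if __name__ == "__main__":
    for event in ClientEvent:
        print("%-16s %s" % (event.name, describe(event)))
```

Keeping "connection state" and "session state" as separate questions avoids the trap of treating AUTH_FAILED as if the session had expired.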