Zookeeper >> mail # user >> Zookeeper protocol weirdness and pure Python kazoo client


Re: Zookeeper protocol weirdness and pure Python kazoo client
On 30 August 2012 17:06, Ben Bangert <[EMAIL PROTECTED]> wrote:

> On Aug 30, 2012, at 4:50 PM, Michi Mutsuzaki <[EMAIL PROTECTED]>
> wrote:
>
> > I don't think there is an official spec beyond the zookeeper.jute
> > file. It would be very helpful if you can share what you have found
> > implementing your python client.
>
> The most interesting thing I found was that documented C/Java behavior
> isn't actually enforced beyond the client boundaries. For some reason I had
> expected the state transitions listed to be some limitation of the
> Zookeeper protocol itself. That got me curious about ways the client
> implementation itself could make using Zookeeper less error-prone, as
> Curator and Kazoo try to do in their respective languages. This is part of
> why I was baffled when an AUTH_FAILED resulted in connection termination,
> but not session expiration.
>
> In the case of kazoo, it's now substantially easier to debug what is
> actually happening since it's always within the bounds of Python.
> Reinstalling debug builds of the C lib and Python C binding got hairy, and
> still obfuscated a lot. This made debugging the prior password mangling
> issue in the Python lib a major pain.
>
> I've found that the C lib seems like a bit of a second-class citizen for
> Zookeeper... the read-only feature has had patches submitted since 2010,
> which still aren't applied. The Python C binding also has multiple
> patches... not applied. We applied several of these to fix memory leaks and
> the password mangling to our zc-zookeeper-static Python lib, but after
> looking through all the remaining C lib bugs/patches and Python C
> bugs/patches, it's way more than we want to deal with. C isn't my forte, and
> I'd much rather debug Java code or read the Java client if I'm curious
> about how something is done, rather than deal with 2 layers of C code in
> addition to TCP and the Java server.
>
> Meanwhile, I can implement the new Zookeeper 3.4 methods after looking at
> the jute code in just minutes, and debugging a problem is trivial when it's
> all Python code. Some people have mentioned that without the C OS thread
> used by the C binding, it's possible under heavy Python thread thrashing
> that a ping might not be sent... which is true, but the session timeout can
> be increased, and that's a very small price to pay given the other things
> noted here with the C lib/binding.
>
> And of course, for other runtimes (PyPy) that can't run C extensions, the
> pure Python kazoo will now work.
>
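
To make the ping trade-off quoted above concrete, here is a rough sketch. It
assumes a client that tries to ping at one third of the session timeout (an
assumption for illustration, not something stated in this thread) and a
hypothetical worst-case stall of the pinging thread. `ping_interval` and
`ping_survives` are made-up helpers, not part of kazoo or the C binding.

```python
def ping_interval(session_timeout: float) -> float:
    """Assumed policy: schedule a ping at one third of the session timeout."""
    return session_timeout / 3.0


def ping_survives(session_timeout: float, worst_stall: float) -> bool:
    """Return True if a delayed ping still lands before the session timeout.

    The last successful ping is taken as time 0. The next ping is scheduled
    one interval later and may be delayed by up to worst_stall seconds.
    """
    return ping_interval(session_timeout) + worst_stall < session_timeout


# With a 10 s timeout, an 8 s stall is too long; at 30 s the same stall fits.
print(ping_survives(10.0, 8.0))
print(ping_survives(30.0, 8.0))
```

Under these assumptions, raising the session timeout widens the margin for a
stalled ping, at the cost of slower detection of a client that is really gone.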

FWIW, this is my only reservation about a pure Python client - there isn't
a spec, and three separate implementations that might have subtly different
behaviours can be a nightmare to maintain. Ben - if you're able to turn any
of your efforts towards documenting your observations about how the
protocol actually works, that would be awesome.

And as regards the unapplied Python patches - that's my bad, I should be
committing them much more often. Can you give me a list of those you've
found useful, and in return for your excellent work I'll get them committed
as soon as I can?
>
> > I think this is expected. ZooKeeper should not expire a session
> > because of authentication failure. That would make it easier for a
> > malicious client to expire random sessions. I don't know if there is a
> > technical reason for dropping the connection though. Maybe it was an
> > arbitrary decision?
>
> Yea, I wasn't terribly sure. The doc state diagram seems to indicate that
> an AUTH_FAILED should be treated the same as a SESSION_EXPIRATION or
> CLOSING event. However, your session is dead in the latter two cases, while
> the session is *not* dead in AUTH_FAILED, yet you end up in the same state.
> I would not be surprised if a substantial amount of code assumed that the
> session was dead when an AUTH_FAILED occurred, yet the session is not dead
> at all.
>
> Cheers,
> Ben
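
The distinction Ben describes above can be sketched as a small lookup. This is
only an illustration of the point made in the thread: the event names come from
the quoted text, while the `Event` enum and the `session_is_dead` helper are
hypothetical and do not correspond to any real client API.

```python
from enum import Enum


class Event(Enum):
    # Names taken from the thread; the enum itself is hypothetical.
    AUTH_FAILED = "AUTH_FAILED"
    SESSION_EXPIRATION = "SESSION_EXPIRATION"
    CLOSING = "CLOSING"


# Per the thread, only these two events mean the session itself is gone.
SESSION_ENDING = frozenset({Event.SESSION_EXPIRATION, Event.CLOSING})


def session_is_dead(event: Event) -> bool:
    """All three events drop the connection, but not all end the session."""
    return event in SESSION_ENDING


for event in Event:
    print(event.value, "session dead:", session_is_dead(event))
```

Code that treats all three events as "session dead" would, under this reading,
discard state for a session the server still considers alive after AUTH_FAILED.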
--
Henry Robinson
Software Engineer
Cloudera
415-994-6679