Re: retreat from zookeeper
Thomas, thanks for your comments. It is an intriguing list of points
you lay out below, and in some sense it highlights the fact that there
is still work to be done. I find it sad, though, that you decided to
frame it in such a destructive way, picturing ZooKeeper as a
proof-of-concept, poorly designed system. I certainly don't share your
view, and the fact that people use it and invest in it makes me think
that it is not as bad as you put it. There are other (and possibly
better) ways of implementing the same or similar functionality, and it
is great to hear that you have good ideas for how to do it. If you are
able to develop such a system and form a community around it, then I'd
certainly consider contributing to it.


On May 19, 2011, at 11:21 AM, Thomas Koch wrote:

> Hi,
> as you may have noticed, I haven't been active in the ZooKeeper project
> for a couple of months. I have been a full-time student again since
> March, so any further activity in Hadoop/ZooKeeper would need to be
> self-motivated.
> Since I don't want to just fade away, and I'll still give a talk about
> ZooKeeper at the Berlin Buzzwords conference (Berlin, June 6/7), I have
> listed the reasons why I wouldn't like to work on the current ZooKeeper
> code base anymore.
> I plan the following structure for my talk:
> 1) theoretical model / protocol of ZooKeeper
> 2) practical applications, projects using ZooKeeper
> 3) shortcomings of the current ZooKeeper code base
> A tentative brain dump of part three is listed below. I appreciate any
> comments that could help me give a balanced presentation of the
> ZooKeeper project.
> If I needed a ZooKeeper implementation right now, I'd probably do a
> minimal-feature rewrite in Scala + Akka. I do appreciate ZooKeeper as an
> invaluable proof-of-concept implementation and pioneer. But as in
> American history, others should come after the pioneers who don't look
> like Clint Eastwood anymore and build tidier things.
> The list:
> * The code is tightly coupled
> * most so-called "unit tests" are actually integration tests. They run
>   the whole application and test one specific functionality.
> * no uniform configuration: command line parameters, system properties,
>   configuration file (java properties)
> * configuration properties copied to static class members
> * feature bloat on fragile foundation: e.g. chroot + automatic
>   resubscription does not work
> * implementation unlike specification: allowed characters in path
> * still on ant instead of maven (depends how you see ant vs. maven)
> * circular object dependencies (e.g. ZooKeeper <-> ClientCnxn)
> * methods with 100+ lines of code and conditions nested well over 5
>   levels deep
> * general attitude against refactoring, no knowledge or appreciation of
>   "Effective Java" (Josh Bloch) or "Clean Code" (Robert C. Martin)
> * magic numbers instead of enum
> * still bound to inline copy of jute (HadoopIO, avro predecessor)
> * even hand coded (de)serialization in leader election
> * no client-only jar. Every client gets the full server code.
> * unwieldy API triggered (at least) two client API wrappers: zkClient,
>   cages
> * insane amounts of code duplication
> * horrible, fragile thread programming: plenty of "XYZ extends Thread"
>   instead of
>    - implements Runnable
>    - or better: executor framework
>    - or much better: actors (see Akka)
>   -> leads to fear of refactoring, because nobody understands all
>   synchronization needs.
> Best regards,
> Thomas Koch, http://www.koch.ro
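
For readers unfamiliar with the threading point in the list above ("XYZ extends Thread" versus a task handed to the executor framework), here is a minimal Java sketch of the alternative. The names ExecutorSketch, PingTask and runAll are hypothetical and are not taken from the ZooKeeper code base; the sketch only assumes the standard java.util.concurrent API and is not a proposal for how ZooKeeper itself should be restructured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSketch {

    // Hypothetical unit of work. It implements Callable instead of
    // extending Thread, so the task does not decide how it is scheduled.
    static final class PingTask implements Callable<String> {
        private final int id;

        PingTask(int id) {
            this.id = id;
        }

        @Override
        public String call() {
            return "ping-" + id;
        }
    }

    // Submits n tasks to a small fixed pool and collects the results
    // in submission order.
    static List<String> runAll(int n)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                futures.add(pool.submit(new PingTask(i)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until this task is done
            }
            return results;
        } finally {
            pool.shutdown(); // thread lifecycle is owned by the pool
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(3));
    }
}
```

Because the task is a plain object, it can be tested by calling call() directly without starting any thread, and the pool size or scheduling policy can change without touching the task code.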


research scientist

direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301