I'm about two days away from having a version of ZooKeeper with an immutable
DataTree on the server. This means that for any modification of the DataTree,
all nodes from the modified one up to the root are replaced by new nodes.
- No synchronisation needed when accessing the DataTree.
- The snapshotter thread gets an immutable DataTree and will write a
consistent DataTree to disk.
- No headaches whether multi transactions could lead to issues with
- Much better testability.
- No concurrency - No headaches.
- I hope for considerable speed improvements. Maybe also some memory savings,
at least from refactorings possible after this step.
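
To illustrate what "all nodes from the modified one up to the root are
replaced" means, here is a minimal sketch of path copying. It is not the code
from my branch: ImmutableNode, leaf, withChild and withData are hypothetical
names, and the node payload is a String only to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable tree node; not ZooKeeper's actual DataNode. */
final class ImmutableNode {
    final String data;
    final Map<String, ImmutableNode> children;

    ImmutableNode(String data, Map<String, ImmutableNode> children) {
        this.data = data;
        // Read-only view, so nobody can modify a published node.
        this.children = Collections.unmodifiableMap(children);
    }

    /** Creates a node without children. */
    static ImmutableNode leaf(String data) {
        return new ImmutableNode(data, new HashMap<>());
    }

    /** Returns a copy of this node with one child added or replaced. */
    ImmutableNode withChild(String name, ImmutableNode child) {
        Map<String, ImmutableNode> copy = new HashMap<>(children);
        copy.put(name, child);
        return new ImmutableNode(data, copy);
    }

    /**
     * Returns a new root in which the node at the given path carries newData.
     * Only the nodes along the path are re-created; every other subtree is
     * shared between the old and the new tree.
     */
    ImmutableNode withData(String[] path, int depth, String newData) {
        if (depth == path.length) {
            return new ImmutableNode(newData, children);
        }
        ImmutableNode child = children.get(path[depth]);
        if (child == null) {
            throw new IllegalArgumentException("no such node: " + path[depth]);
        }
        return withChild(path[depth], child.withData(path, depth + 1, newData));
    }
}
```

A caller would keep the old root for as long as it needs it (the snapshotter,
for example) while the writer continues with the root returned by
root.withData(new String[] {"a", "b"}, 0, "new"). The old root still sees the
old data, which is why no synchronisation is needed for reading.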
Possible further improvements:
Read requests actually don't need to enter the processor pipeline. Instead
each server connection could get a reference to a (zxid, tree) tuple. Updates
are delivered as (zxid, newTree, triggerWatchesCallback) to the server
The watches could be managed at each server connection instead of centrally at
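
A rough sketch of the (zxid, tree) idea, under simplifying assumptions:
Versioned and TreeHolder are hypothetical names, a single shared holder
stands in for the per-connection references, and there is exactly one writer
thread. Readers take one reference and work on a consistent tree without
entering any pipeline.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical (zxid, tree) pair; T stands in for an immutable tree type. */
final class Versioned<T> {
    final long zxid;
    final T tree;

    Versioned(long zxid, T tree) {
        this.zxid = zxid;
        this.tree = tree;
    }
}

/** Hypothetical holder: one writer publishes, any number of readers read. */
final class TreeHolder<T> {
    private final AtomicReference<Versioned<T>> current;

    TreeHolder(T initialTree) {
        current = new AtomicReference<>(new Versioned<>(0L, initialTree));
    }

    /** Readers get the zxid and the tree together, from a single read. */
    Versioned<T> snapshot() {
        return current.get();
    }

    /** The writer swaps in the new tree first, then runs the watch callback. */
    void publish(long zxid, T newTree, Runnable triggerWatchesCallback) {
        current.set(new Versioned<>(zxid, newTree));
        triggerWatchesCallback.run();
    }
}
```

Because the pair is replaced as a whole, a reader can never observe a new
zxid together with an old tree or the other way round.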
I'm pre-announcing this now to give an explanation of my intentions with ZK.
I've created a couple of patches that refactor ZK in small, incremental and
individually tested steps as far as possible towards the immutable data tree.
The branch is here:
I'm working on the final step now, which actually makes DataNode immutable and
is the only larger change. I don't see yet how I can split this in
Of course it's totally up to you whether you'd like to take this road.
Once I've got there, I'd like to do performance comparisons between old and
new. However, I don't have access to hardware for such a test. So this would
only be possible if somebody would like to sponsor me with access to a
cluster or some machine hours at Amazon.
Thomas Koch, http://www.koch.ro