Re: Does Zookeeper Lock Entire Znode Tree During a Write
Henry Robinson 2012-05-15, 16:23
On 14 May 2012 21:15, dmly <[EMAIL PROTECTED]> wrote:
> Does ZK lock the entire znode tree during a write? Or does ZK just lock
> the topmost znode that a client is connected to?
> For example:
> When I connect to "/doug", create the "/doug/lock-001" node and do an
> update, is "/" locked or just "/doug"?

The short answer is no, ZK does not lock the whole tree. If you look at
ZKDatabase.java and DataTree.java you can see in processTxn etc. that
only the node being written to (the parent node in the case of a create
transaction) is 'locked' in the Java sense.
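
As a rough illustration of what 'locked in the Java sense' means here, the
sketch below synchronizes only on the node a write touches (the parent, for
a create). It is a toy model, not ZooKeeper's actual DataTree: the names
ToyTree, Node, create, setData and childrenOf are all invented for this
example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only (hypothetical names): shows the per-node locking pattern,
// not the real ZooKeeper DataTree implementation.
class ToyTree {
    static final class Node {
        byte[] data;
        final Set<String> children = new HashSet<>();
        Node(byte[] data) { this.data = data; }
    }

    private final Map<String, Node> nodes = new ConcurrentHashMap<>();

    ToyTree() {
        nodes.put("/", new Node(new byte[0]));
    }

    // A create changes the parent's child list, so only the parent is locked.
    void create(String path, byte[] data) {
        int slash = path.lastIndexOf('/');
        String parentPath = (slash == 0) ? "/" : path.substring(0, slash);
        String childName = path.substring(slash + 1);
        Node parent = nodes.get(parentPath);
        if (parent == null) {
            throw new IllegalArgumentException("no parent: " + parentPath);
        }
        synchronized (parent) {
            if (!parent.children.add(childName)) {
                throw new IllegalStateException("already exists: " + path);
            }
            nodes.put(path, new Node(data));
        }
    }

    // An update locks only the node whose data changes.
    void setData(String path, byte[] data) {
        Node node = nodes.get(path);
        if (node == null) {
            throw new IllegalArgumentException("no node: " + path);
        }
        synchronized (node) {
            node.data = data;
        }
    }

    // Returns a snapshot of the child names of an existing node.
    Set<String> childrenOf(String path) {
        Node node = nodes.get(path);
        if (node == null) {
            throw new IllegalArgumentException("no node: " + path);
        }
        synchronized (node) {
            return new HashSet<>(node.children);
        }
    }
}
```

In this model, creating "/doug/lock-001" takes the monitor of "/doug" only;
"/" is never locked by that call.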

However, the reason you're probably asking is that you're wondering about
concurrency of operations, and whether one write can block another write
from succeeding. This makes sense from a traditional database perspective,
where transactions are potentially long-running and fine-grained locking
is needed to allow concurrent access to disjoint parts of a single table,
for example.

In ZooKeeper, all write operations are serialised: not just in the
logical, equivalent-to-a-sequential-history sense, but in the sense that
they are all executed one after the other. So in that sense a write 'locks'
the whole tree, because until it has been applied in memory, no subsequent
write to memory will take place.
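
One way to picture 'executed one after the other' is a single worker thread
that applies every write in the order it was submitted. The sketch below
assumes exactly that; the class SerialWrites and the integer transaction
ids are made up for illustration and are not ZooKeeper code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a single applier thread means no two writes
// are ever applied to the in-memory state at the same time.
class SerialWrites {
    // Submits n writes and returns the order in which they were applied.
    static List<Integer> run(int n) {
        ExecutorService applier = Executors.newSingleThreadExecutor();
        List<Integer> applied = new ArrayList<>(); // written only by the applier thread
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                final int txnId = i;
                pending.add(applier.submit(() -> { applied.add(txnId); }));
            }
            // Waiting on each Future also makes the applier's writes visible here.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            applier.shutdown();
        }
        return applied;
    }
}
```

Because there is only one applier, per-node locks are never contended by
two writes in this model; the ordering comes from the queue, not the locks.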

That's not to say that ZK doesn't provide some concurrency, though.
Transactions are multi-stage operations, and it's possible to pipeline
these stages so that e.g. disk and CPU can be fully used at the same time.
For example, an earlier stage of a write operation is logging the request
to disk, for fault-tolerance purposes. It is possible for a later write to
be logged while an earlier write is being applied in memory. ZK takes
advantage of this pipelining (see the RequestProcessor classes) so you can
have several transactions 'in flight' at the same time, but they are all
issued and processed in strict sequential order.
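
The pipelining idea can be sketched with two threads joined by FIFO queues:
one stage 'logs' a transaction, the next 'applies' it. This is only a
minimal model under my own assumptions, not the real RequestProcessor
chain; WritePipeline, the stage names and the STOP sentinel are invented
for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical two-stage pipeline: the log stage can work on txn k+1 while
// the apply stage handles txn k, yet FIFO queues preserve strict order.
class WritePipeline {
    private static final int STOP = -1; // sentinel that shuts each stage down

    // Pushes txns 0..n-1 through both stages; returns the apply order.
    static List<Integer> run(int n) {
        BlockingQueue<Integer> toLog = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> toApply = new LinkedBlockingQueue<>();
        List<Integer> logged = new ArrayList<>();  // stand-in for the on-disk log
        List<Integer> applied = new ArrayList<>(); // stand-in for the in-memory tree

        Thread logStage = new Thread(() -> {
            try {
                while (true) {
                    int txn = toLog.take();
                    if (txn == STOP) {
                        toApply.put(STOP);
                        return;
                    }
                    logged.add(txn);  // pretend to write the request to disk
                    toApply.put(txn); // hand over only after it has been logged
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread applyStage = new Thread(() -> {
            try {
                while (true) {
                    int txn = toApply.take();
                    if (txn == STOP) {
                        return;
                    }
                    applied.add(txn); // pretend to update the in-memory tree
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        logStage.start();
        applyStage.start();
        try {
            for (int i = 0; i < n; i++) {
                toLog.put(i);
            }
            toLog.put(STOP);
            logStage.join();
            applyStage.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return applied;
    }
}
```

Each stage is single-threaded and the queues are FIFO, so several
transactions can be in flight at once without ever being reordered.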