Distributed STM with Zookeeper 3.4.x possible?
Hi folks,

It seems to me that with the multi-update operation in Zookeeper 3.4.x, it
should be possible to do a form of distributed STM?

The idea I have is this...

- For every zNode that I'll set or delete, I perform a get first, so that I
have its latest version number.
 - This means when I commit the transaction, I can add the version number
to the set and delete ops. The transaction would fail if any of the nodes
have been changed after the get operation.
- If I assume all other operations I do are also atomic transactions, then
create should also be safe:
 - Assume my commit fails because of the create, then it's because another
atomic transaction had created the node before me
 - Assume my create succeeds, then it's either because nothing happened
during my transaction or other atomic transactions have created and then
deleted the node

So a typical transaction would look like this:

- Get all involved nodes for their version numbers
- Queue up create, set, delete operations into a multi-op array
- Commit the multi-update atomically
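
To make the three steps above concrete, here is a minimal sketch in Python. It
runs against a toy in-memory store, not against ZooKeeper: ToyStore, TxnFailed
and move_value are hypothetical names invented for this illustration, and the
4-tuple op format is an assumption, not the real multi-op API. It only models
the two properties the scheme relies on: per-node version numbers and an
all-or-nothing multi.

```python
class TxnFailed(Exception):
    """Raised when any op in a multi fails; in that case nothing is applied."""


class ToyStore:
    """In-memory stand-in for a versioned node store (NOT the ZooKeeper API)."""

    def __init__(self):
        self.nodes = {}  # path -> (data, version)

    def get(self, path):
        return self.nodes[path]  # (data, version); KeyError if missing

    def multi(self, ops):
        """Apply every op atomically, or none of them."""
        staged = dict(self.nodes)  # work on a copy, swap it in on success
        for kind, path, data, expected in ops:
            if kind == "create":
                if path in staged:
                    raise TxnFailed(f"create {path}: node exists")
                staged[path] = (data, 0)
                continue
            if path not in staged:
                raise TxnFailed(f"{kind} {path}: no node")
            version = staged[path][1]
            if version != expected:
                raise TxnFailed(f"{kind} {path}: version {version} != {expected}")
            if kind == "set":
                staged[path] = (data, version + 1)
            elif kind == "delete":
                del staged[path]
            else:
                raise ValueError(f"unknown op {kind!r}")
        self.nodes = staged


def move_value(store, src, dst):
    """Hypothetical transaction: create dst with src's data, then delete src."""
    data, version = store.get(src)      # step 1: read and remember the version
    ops = [                             # step 2: queue the operations
        ("create", dst, data, None),
        ("delete", src, None, version),
    ]
    store.multi(ops)                    # step 3: commit atomically


store = ToyStore()
store.multi([("create", "/a", b"1", None)])

data, version = store.get("/a")              # our transaction reads /a
store.multi([("set", "/a", b"2", version)])  # another client commits first

try:
    store.multi([("create", "/b", data, None), ("delete", "/a", None, version)])
    conflict = False
except TxnFailed:
    conflict = True                          # stale version, nothing applied

after_conflict = dict(store.nodes)
move_value(store, "/a", "/b")                # retry with a fresh read
```

The first commit attempt fails because /a changed after it was read, and the
failed multi leaves the store untouched. The retry re-reads /a and goes
through.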

The problem is that I have no way to read the values atomically.

If I simply issue a sequence of get() and get_children() requests, then
it's possible for an atomic commit to happen between my get requests, and
thus it's possible for me to read an inconsistent state.

The solution I have in mind is to issue the get() and get_children()
requests with watches. If I don't see a watch notification between the
first get and the final get, then the data I read should correspond to a
single transaction (possibly one committed after the first get, but that
is still correct). If I do get a watch notification during my get
requests, then I know my data may be inconsistent and I should try again.
This should work, but it seems quite clumsy to me.
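
A rough sketch of that read-with-watches loop, again in Python against a toy
in-memory store rather than ZooKeeper. WatchedStore, consistent_read and the
between_reads hook are hypothetical names made up for this illustration. The
toy fires watches synchronously inside the write, which sidesteps the question
of when real notifications arrive, so treat it as a model of the retry logic
only.

```python
class WatchedStore:
    """Toy store with one-shot watches (NOT the ZooKeeper API)."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)  # path -> data
        self.watches = {}         # path -> list of one-shot callbacks

    def get(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.nodes[path]

    def multi_set(self, updates):
        """Atomically update several nodes, then fire their watches."""
        self.nodes.update(updates)
        for path in updates:
            for callback in self.watches.pop(path, []):
                callback(path)


def consistent_read(store, paths, between_reads=lambda attempt, i: None,
                    max_attempts=5):
    """Read all paths; retry if any watch fired while reading."""
    for attempt in range(max_attempts):
        fired = []     # paths whose watch fired during this attempt
        snapshot = {}
        for i, path in enumerate(paths):
            snapshot[path] = store.get(path, watch=fired.append)
            between_reads(attempt, i)  # hook simulating a concurrent writer
        if not fired:
            return snapshot            # no change seen: accept the snapshot
    raise RuntimeError("gave up after too many conflicting writes")


store = WatchedStore({"/x": 1, "/y": 1})  # writers keep /x equal to /y


def writer(attempt, i):
    # On the first attempt, commit right after /x has been read.
    if attempt == 0 and i == 0:
        store.multi_set({"/x": 2, "/y": 2})


snapshot = consistent_read(store, ["/x", "/y"], between_reads=writer)
```

On the first attempt the reader sees /x = 1 and then /y = 2, but the watch on
/x has fired, so that attempt is discarded. The second attempt sees no
notification and returns /x = 2, /y = 2.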

Any ideas?

Best Regards,
Martin Kou