Zookeeper >> mail # user >> Consistency in zookeeper

Re: Consistency in zookeeper
Let me add a couple of points to this thread. Yasin didn't ask about a concrete use case; it sounds more like an exploratory question than a question about how to solve a particular problem. If there is a use case behind the question, it would be great to hear about it.

One reason we had to serve read requests locally comes from the assumption that ZooKeeper traffic is dominated by reads. Because read requests are processed locally, we can increase read throughput capacity by adding more servers.

The consistency guarantee that ZooKeeper provides is not eventual consistency in the sense I'm used to, where replicas can diverge but eventually converge. ZK replica servers don't diverge, but they can be arbitrarily far behind in applying updates that have already been decided upon. We can control to some extent how far behind a follower can be by changing syncLimit.
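
To make the "behind but never divergent" point concrete, here is a toy model in Java. It is only a sketch under stated assumptions: the Leader and Follower classes and the catchUp/sync methods are made up for illustration and are not ZooKeeper code or its API. Each follower applies a prefix of the leader's single ordered log, so a local read can be stale but can never show a state the leader did not commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical, not ZooKeeper code): followers apply a prefix of the leader's log. */
class LaggingReplicaDemo {

    /** Holds the single, totally ordered log of committed updates. */
    static class Leader {
        private final List<String> committed = new ArrayList<>();

        void commit(String value) { committed.add(value); }

        int lastIndex() { return committed.size(); }

        // Log entries are numbered from 1.
        String entry(int index) { return committed.get(index - 1); }
    }

    /** Applies committed updates in order; it may lag but never reorders or invents entries. */
    static class Follower {
        private final Leader leader;
        private int applied = 0;      // number of log entries applied so far
        private String value = null;  // value served to local reads

        Follower(Leader leader) { this.leader = leader; }

        /** Apply at most `count` further entries from the leader's log. */
        void catchUp(int count) {
            int target = Math.min(leader.lastIndex(), applied + count);
            while (applied < target) {
                applied++;
                value = leader.entry(applied);
            }
        }

        /** Rough analogue of a sync: apply everything committed so far. */
        void sync() { catchUp(leader.lastIndex() - applied); }

        /** Local read: served from this replica's own state, possibly stale. */
        String read() { return value; }

        int appliedIndex() { return applied; }
    }

    public static void main(String[] args) {
        Leader leader = new Leader();
        Follower fast = new Follower(leader);
        Follower slow = new Follower(leader);

        leader.commit("v1");
        leader.commit("v2");
        leader.commit("v3");

        fast.catchUp(3);  // fully caught up
        slow.catchUp(1);  // lagging: has applied only the first update
        System.out.println("fast reads " + fast.read() + ", slow reads " + slow.read());

        slow.sync();      // the slow follower catches up before serving the next read
        System.out.println("after sync, slow reads " + slow.read());
    }
}
```

In this model the slow follower's state is always some earlier state of the fast one, which is the sense in which replicas lag rather than diverge.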


On Mar 1, 2013, at 7:19 PM, Alexander Shraer <[EMAIL PROTECTED]> wrote:

> It's possible, but what it gets you is that the read will see at least
> the writes that completed before the sync started.
> possibly later writes too. Actually, this is true only with some
> timing assumption. As was previously discussed on the
> list, in order to really guarantee this property even with leader
> failures, the leader would have to broadcast sync commands just like
> updates,
> which it currently doesn't do for some reason.
> Alex
> On Fri, Mar 1, 2013 at 9:49 AM, kishore g <[EMAIL PROTECTED]> wrote:
>> Will sync and read really help to achieve what Yasin wants? Is it not
>> possible for the value to change between sync and read?
>> Thanks
>> Kishore G
>> On Thu, Feb 28, 2013 at 9:32 PM, Rakesh R <[EMAIL PROTECTED]> wrote:
>>> Hi Yasin,
>>> Adding one more point,
>>> ZooKeeper provides different ways of achieving data sync. As Alex &
>>> Vladimir explained, the sync() API is one way, and it has a performance
>>> overhead.
>>> Another approach is to define Watchers. This will also help keep the
>>> data in sync between the clients. It internally uses an asynchronous
>>> way of notifying different events. It's also very lightweight; here the
>>> user/client should define specific watchers to achieve a synchronized
>>> view of the data.
>>> ZK supports various events like NodeDataChanged, NodeChildrenChanged.
>>> Since it is asynchronous, there will be slight latency in receiving the
>>> events.
>>> Reference:
>>> http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkWatches
>>> Section: •The data for which the watch was set
>>> http://zookeeper.apache.org/doc/r3.2.2/zookeeperTutorial.html#sc_producerConsumerQueues
>>> -Rakesh
>>> ________________________________________
>>> From: Alexander Shraer [[EMAIL PROTECTED]]
>>> Sent: Friday, March 01, 2013 5:19 AM
>>> Subject: Re: Consistency in zookeeper
>>> Hi Yasin,
>>> I assume you mean "linearizability" by "strict consistency".
>>> ZooKeeper provides "sequential consistency". This is weaker than
>>> linearizability but is still very strong, much stronger than "eventual
>>> consistency".
>>> In addition, all update operations are linearizable as they are
>>> sequenced by the leader. With sequential consistency, a reader never
>>> "goes back in time":
>>> even if you read from a different follower every time, you'll never
>>> see version 3 of the data after seeing version 4.
>>> ZooKeeper also provides a sync command. If you invoke a sync command
>>> and then a read, the read is guaranteed to see at least the last write
>>> that
>>> completed before the sync started. So if you always do "sync + read"
>>> instead of just "read", you get linearizability. But you pay in
>>> performance since
>>> these reads will no longer be executed locally on the follower to
>>> which you're connected - the sync is sent to the leader. That's why
>>> ZooKeeper gives
>>> you the option of doing a fast read that is consistent but may