Zookeeper, mail # user - Linking two sites via two Zookeeper instances

Re: Linking two sites via two Zookeeper instances
Alexander Shraer 2013-01-28, 19:48
Hi Christian,

I don't think this is currently possible. I believe there has been some
work on building a hierarchy of ZooKeeper clusters @ Facebook, but I don't
know the details. I don't believe that this would mean less management
overhead though, since you'd still need several voting servers in each

But I actually wanted to ask you about your use case. Do you have
consistency requirements among data items mastered in different datacenters?
For example, do you require that all clients (no matter where they are)
see changes to /A/* and /B/* in the same order? Could you share some more
details? Or, let's say you have 3 datacenters, one mastering /A/*, another
/B/*, and the third /C/*. Suppose that the first datacenter sees a change to
/C/x and afterwards /A/y is updated. Is it possible that someone in
datacenter B sees the new /A/y before the new /C/x?
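
To make the ordering question concrete, here is a minimal Python sketch of how such an anomaly could arise if each subtree is mastered by an independent ensemble and replicated to the other sites in log order. It is a toy model, not ZooKeeper code: the `Ensemble` and `Observer` classes and the `run_scenario` function are hypothetical names invented for this illustration, and replication lag is modelled simply by choosing when `catch_up()` is called.

```python
class Ensemble:
    """Toy stand-in for one independent ensemble: it totally orders its own writes."""

    def __init__(self, initial):
        self.log = list(initial.items())  # committed (path, value) pairs, in order

    def write(self, path, value):
        self.log.append((path, value))


class Observer:
    """Toy remote replica: applies one ensemble's log in order, possibly lagging."""

    def __init__(self, source):
        self.source = source
        self.applied = 0  # number of log entries applied so far
        self.view = {}

    def catch_up(self):
        for path, value in self.source.log[self.applied:]:
            self.view[path] = value
        self.applied = len(self.source.log)


def run_scenario():
    ens_a = Ensemble({"/A/y": "old"})  # mastered in datacenter A
    ens_c = Ensemble({"/C/x": "old"})  # mastered in datacenter C

    c_seen_in_a = Observer(ens_c)  # datacenter A's replica of /C/*
    a_seen_in_b = Observer(ens_a)  # datacenter B's replica of /A/*
    c_seen_in_b = Observer(ens_c)  # datacenter B's replica of /C/*
    for obs in (c_seen_in_a, a_seen_in_b, c_seen_in_b):
        obs.catch_up()

    ens_c.write("/C/x", "new")
    c_seen_in_a.catch_up()      # datacenter A sees the change to /C/x ...
    ens_a.write("/A/y", "new")  # ... and only afterwards updates /A/y

    a_seen_in_b.catch_up()      # the A -> B link delivers promptly
    # The C -> B link lags, so c_seen_in_b.catch_up() has not run yet.
    return {"/A/y": a_seen_in_b.view["/A/y"], "/C/x": c_seen_in_b.view["/C/x"]}


print(run_scenario())  # {'/A/y': 'new', '/C/x': 'old'}
```

In this model each replica applies a single ensemble's updates in order, but nothing orders updates across ensembles, so a client in datacenter B can read the new /A/y while still seeing the old /C/x.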

The reason I'm asking is that some time ago others and I made this
initial proposal:
which didn't get enough support for lack of a compelling use case (among
other things).


On Sun, Jan 27, 2013 at 5:56 AM, Christian Schuhegger <

> Hello,
> up to now I have not worked with ZooKeeper itself and am only reading
> documentation. From the document "ZooKeeper: Wait-free coordination for
> Internet-scale systems" I understand that ZooKeeper uses a single-writer
> (the leader) approach for a ZooKeeper cluster, i.e. all writes go through
> the leader.
> From the documentation about Observers:
> http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html
> I understand that "Observers have other advantages. Because they do not
> vote, they are not a critical part of the ZooKeeper ensemble. Therefore
> they can fail, or be disconnected from the cluster, without harming the
> availability of the ZooKeeper service. The benefit to the user is that
> Observers may connect over less reliable network links than Followers. In
> fact, Observers may be used to talk to a ZooKeeper server from another data
> center."
> The use-case I have in mind is to use ZooKeeper within one data-center and
> synchronize data via the Observer mechanism to another data center half way
> around the world. I would need to do the same symmetrically the other way
> round.
> I could imagine doing this by setting up two independent ZooKeeper
> clusters, one which has the leader and the other voters in data center A
> and its Observers in data center B and another ZooKeeper cluster which has
> the leader and the other voters in data center B and its Observers in data
> center A.
> This would mean an increased maintenance overhead, I believe.
> My question now is whether it is somehow possible to do this with one ZooKeeper
> cluster only by configuration, i.e. defining that all writes that go to
> znode /A and its children (e.g. /A/a, /A/b, ...) are handled by a group of
> voters in data center A and all writes that go to znode /B and its children
> are handled by a group of voters in data center B. All ZooKeeper servers
> would be at least observers of all znodes in the cluster, i.e. the group of
> ZooKeeper servers that is not a voter for a given top-level node would at
> least be an observer of that top-level node.
> I would be interested in extending this concept to more than two data centers
> with ping times between the data centers on the order of 300ms.
> Many thanks and best regards,
> --
> Christian Schuhegger
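
The two-ensemble layout described in the quoted message (voters in one data center with an Observer in the other, and the mirror image for the second ensemble) could be configured roughly as sketched below. The Python helper `zoo_cfg` is a hypothetical function written for this illustration; it only renders `zoo.cfg` text. The `peerType=observer` setting and the `:observer` suffix on `server.N` lines are the Observer settings described in the ZooKeeper Observers guide. All hostnames, the `dataDir` path, and the timing values are placeholder assumptions, not tuned recommendations for a 300ms link.

```python
def zoo_cfg(voters, observers, this_host):
    """Render zoo.cfg text for one member of an ensemble (illustrative only)."""
    lines = [
        "tickTime=2000",               # placeholder timing values
        "initLimit=10",
        "syncLimit=5",
        "dataDir=/var/lib/zookeeper",  # placeholder path
        "clientPort=2181",
    ]
    # Only the Observer's own config file declares its peer type.
    if this_host in observers:
        lines.append("peerType=observer")
    # Every member lists all servers; Observers are tagged with ':observer'.
    members = [(h, "") for h in voters] + [(h, ":observer") for h in observers]
    for server_id, (host, suffix) in enumerate(members, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888{suffix}")
    return "\n".join(lines)


# Ensemble A: voters in data center A, one Observer in data center B.
ENSEMBLE_A = {
    "voters": ["a1.dc-a.example", "a2.dc-a.example", "a3.dc-a.example"],
    "observers": ["a-obs.dc-b.example"],
}
# Ensemble B: the mirror image, with voters in B and an Observer in A.
ENSEMBLE_B = {
    "voters": ["b1.dc-b.example", "b2.dc-b.example", "b3.dc-b.example"],
    "observers": ["b-obs.dc-a.example"],
}

if __name__ == "__main__":
    print(zoo_cfg(ENSEMBLE_A["voters"], ENSEMBLE_A["observers"], "a-obs.dc-b.example"))
```

Each server would additionally need a `myid` file in its `dataDir` matching its `server.N` number. Note that this yields two separate namespaces with no ordering guarantees between them, which is the concern raised in the reply above.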