Up to now I have not worked with ZooKeeper itself and have only been
reading documentation. From the paper "ZooKeeper: Wait-free coordination
for Internet-scale systems" I understand that ZooKeeper uses a
single-writer (the leader) approach for a ZooKeeper cluster, i.e. all
writes go through the leader.
From the documentation about Observers I understand the following:
"Observers have other advantages. Because they do not
vote, they are not a critical part of the ZooKeeper ensemble. Therefore
they can fail, or be disconnected from the cluster, without harming the
availability of the ZooKeeper service. The benefit to the user is that
Observers may connect over less reliable network links than Followers.
In fact, Observers may be used to talk to a ZooKeeper server from
another data center."
The use case I have in mind is to use ZooKeeper within one data center
and synchronize data via the Observer mechanism to another data center
halfway around the world. I would need to do the same symmetrically in
the other direction.
I could imagine doing this by setting up two independent ZooKeeper
clusters: one that has its leader and the other voters in data center A
and its Observers in data center B, and another that has its leader and
the other voters in data center B and its Observers in data center A.
I believe this would mean increased maintenance overhead.
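To make the two-cluster idea concrete, here is a small Python sketch that builds the server lines I would expect in the zoo.cfg of each cluster. It is only my reading of the Observers guide, not something I have run against ZooKeeper: the host names are made up, the ports 2888 and 3888 are simply the ones used in the documentation examples, and each Observer would additionally need peerType=observer in its own configuration file.

```python
def server_lines(voters, observers):
    """Return zoo.cfg 'server.N=...' lines for one ensemble.

    Voters get a plain entry; observers get the ':observer' suffix.
    Ports 2888 (quorum) and 3888 (leader election) are assumed defaults.
    """
    lines = []
    for server_id, host in enumerate(voters + observers, start=1):
        suffix = ":observer" if host in observers else ""
        lines.append(f"server.{server_id}={host}:2888:3888{suffix}")
    return lines


# Hypothetical hosts: three voters per data center plus one remote observer.
cluster_a = server_lines(
    voters=["zk1.dc-a.example", "zk2.dc-a.example", "zk3.dc-a.example"],
    observers=["zk-obs1.dc-b.example"],
)
cluster_b = server_lines(
    voters=["zk1.dc-b.example", "zk2.dc-b.example", "zk3.dc-b.example"],
    observers=["zk-obs1.dc-a.example"],
)

if __name__ == "__main__":
    print("# cluster A (voters in A, observer in B)")
    print("\n".join(cluster_a))
    print("# cluster B (voters in B, observer in A)")
    print("\n".join(cluster_b))
```

Both clusters would have to be configured, monitored and upgraded separately, which is the overhead I would like to avoid.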
My question is whether it is somehow possible to do this with a single
ZooKeeper cluster purely by configuration, i.e. by defining that all
writes to znode /A and its children (e.g. /A/a, /A/b, ...) are handled
by a group of voters in data center A, and all writes to znode /B and
its children are handled by a group of voters in data center B. Every
ZooKeeper server would be at least an observer of all znodes in the
cluster, i.e. the servers that are not voters for a given top-level node
would still be observers of that node.
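To illustrate the behaviour I am asking about, the following Python sketch shows the kind of mapping I have in mind. It is purely hypothetical and does not correspond to any ZooKeeper API or configuration option that I know of; the names VOTER_GROUPS and voter_group_for are invented for this example.

```python
# Hypothetical mapping: top-level znode -> data center whose servers
# would act as voters for writes below that znode.
VOTER_GROUPS = {
    "/A": "dc-a",
    "/B": "dc-b",
}


def voter_group_for(path):
    """Return the voter group responsible for writes to `path`.

    The decision depends only on the top-level znode, so /A, /A/a and
    /A/a/x all map to the same group.
    """
    if not path.startswith("/") or path == "/":
        raise ValueError(f"expected an absolute znode path below /, got {path!r}")
    top_level = "/" + path.split("/")[1]
    if top_level not in VOTER_GROUPS:
        raise ValueError(f"no voter group configured for {top_level}")
    return VOTER_GROUPS[top_level]


if __name__ == "__main__":
    for znode in ("/A", "/A/a", "/B/b"):
        print(znode, "->", voter_group_for(znode))
```

Servers outside the returned group would only observe that subtree.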
I would be interested in extending this concept to more than two data
centers, with ping times between the data centers on the order of 300 ms.
Many thanks and best regards,
Jordan Zimmerman 2013-01-28, 20:09
Alexander Shraer 2013-01-28, 19:48
Christian Schuhegger 2013-02-02, 03:57
Christian Schuhegger 2013-02-02, 03:39