Zookeeper, mail # user - Linking two sites via two Zookeeper instances


Re: Linking two sites via two Zookeeper instances
Christian Schuhegger 2013-02-02, 03:39
Hi Alexander,

Alexander Shraer wrote:
> I don't think this is currently possible. I believe there has been some
> work on building a hierarchy of ZooKeeper clusters @ Facebook, but I don't
> know the details. I don't believe that this would mean less management
> overhead though, since you'd still need several voting servers in each
> datacenter.

OK, I understand.

> But I actually wanted to ask you about your use case. Do you have
> consistency requirements among data items mastered in different datacenters?
> For example, do you require that all clients (no matter where they are)
> see changes to /A/* and /B/* in the same order? Could you share some more
> details? Or, let's say you have 3 datacenters, one mastering /A/*, another
> /B/* and the third /C/*. Suppose that the first datacenter sees a change to
> /C/x and afterwards /A/y is updated. Is it possible that someone in
> datacenter B sees the new /A/y before the new /C/x?

The two things that you might need in a distributed set-up are agreement
and/or order. Agreement would mean that all participants in the
distributed set-up get ALL updates and order would mean that they get
all updates in the same sequential order.

ZooKeeper implements both.
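
To make the distinction concrete, here is a minimal sketch in Python. It
is not ZooKeeper code; the function and variable names are made up for
illustration, and the paths are borrowed from your /C/x and /A/y example.
Two sites receive the same set of updates (agreement) but apply them in
a different sequence (no common order):

```python
# Hypothetical illustration of "agreement" vs. "order"; not ZooKeeper code.

def apply_updates(updates):
    """Apply (path, value) updates in the given sequence.

    Returns the resulting state and the sequence of paths as observed.
    """
    state = {}
    seen = []
    for path, value in updates:
        state[path] = value
        seen.append(path)
    return state, seen

# Both sites receive the same two updates, but in a different sequence.
updates_site_1 = [("/C/x", 1), ("/A/y", 2)]
updates_site_2 = [("/A/y", 2), ("/C/x", 1)]

state_1, order_1 = apply_updates(updates_site_1)
state_2, order_2 = apply_updates(updates_site_2)

# Agreement: every site gets ALL updates (compared here as sets).
agreement = set(updates_site_1) == set(updates_site_2)

# Order: every site sees the updates in the same sequence.
same_order = order_1 == order_2

print("agreement:", agreement)
print("same order:", same_order)
print("final states equal:", state_1 == state_2)
```

In this sketch the final states end up equal only because the two
updates touch different paths. If both updates wrote to the same path,
agreement without order could leave the sites with different values.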

For several of my use cases agreement and order would be required within
one data center, because we simply structure (shard) our services and
user groups in such a way that the users who need both agreement and
order access services within one data center. Across data centers I
would only need agreement. It would be nice to have agreement and order
across data centers, but because of latency requirements I guess this
would be prohibitively expensive.

Now to your question: yes, it would be fine if a client saw /C/x and
/A/y in a different order.

> The reason I'm asking is that some time in the past me and others made this
> initial proposal:
> http://wiki.apache.org/hadoop/ZooKeeper/MountRemoteZookeeper
> which didn't get enough support for lack of a compelling use-case (among
> other things).

It would be nice if ZooKeeper offered agreement and order in a more
granular fashion. I could imagine that write throughput could benefit
even within one data center if you have a use case that only needs
agreement but you still pay for order, e.g. because all writes are
threaded through a single writer.

Thanks for your thoughts!
--
Christian Schuhegger