Zookeeper >> mail # dev >> ZK to coordinate across data centers

Narayanan A R 2012-05-08, 07:21
Camille Fournier 2012-05-08, 13:42
Narayanan A R 2012-05-09, 03:27
Camille Fournier 2012-05-09, 14:16
Narayanan A R 2012-05-09, 20:37
Re: ZK to coordinate across data centers
You can't guarantee that ZK stays live at all times across an even number of
data centers. If you want to keep quorum even if you lose a
data center, you need an odd number of data centers for your quorum; in
your case, that would be 3.
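As a rough sketch of the quorum arithmetic behind this (the function names are
only illustrative, and the 3-servers-per-data-center layout is an assumption
taken from the quoted message): ZK needs a strict majority of the ensemble to
be reachable, so the check is whether a majority survives after removing any
one data center.

```python
from typing import Sequence


def quorum_size(total_servers: int) -> int:
    """Smallest strict majority of the ensemble."""
    return total_servers // 2 + 1


def survives_loss_of_any_dc(servers_per_dc: Sequence[int]) -> bool:
    """True if a majority of servers remains after losing any single data center."""
    total = sum(servers_per_dc)
    needed = quorum_size(total)
    return all(total - lost >= needed for lost in servers_per_dc)


# Assumed layout: 3 servers in each data center.
print(survives_loss_of_any_dc([3, 3]))     # 2 DCs: 6 servers, quorum 4, 3 left -> False
print(survives_loss_of_any_dc([3, 3, 3]))  # 3 DCs: 9 servers, quorum 5, 6 left -> True
```

With two equally sized data centers, losing either one leaves exactly half of
the ensemble, which is not a majority.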
I don't have load numbers available to share, unfortunately, and really
ZK load across DCs depends quite a bit on the hardware setup and
network, but I suspect that you will be totally fine. 1000 locks a
minute is not a very high load and 10 clients is pretty minimal.
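To put the peak figure in per-second terms, here is a back-of-the-envelope
calculation. It assumes each lock costs two writes (one to create the lock
znode and one to delete it on release, as in the standard lock recipe); that
factor and the function name are assumptions for illustration, not measured
numbers.

```python
def writes_per_second(locks_per_minute: float, writes_per_lock: int = 2) -> float:
    """Convert a lock rate per minute into an approximate ZK write rate per second.

    writes_per_lock=2 is an assumption: one create plus one delete per lock.
    """
    return locks_per_minute * writes_per_lock / 60.0


peak = writes_per_second(1000)
print(f"{peak:.1f} writes/sec at peak")  # 33.3 writes/sec at peak
```

Whether a given ensemble can sustain that rate still depends on the hardware
and network, so it is worth measuring on a simple setup.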


On Wed, May 9, 2012 at 4:37 PM, Narayanan A R wrote:
> It is between the data centers. So the BCP requirement is to keep offering
> locks reliably for all data centers (about 2 to 4 data centers) even if the
> network connectivity between the data centers goes down or servers die in
> one DC.
> The load is spiky and not constant. The worst case peak could be about 1000
> locks every minute or so. There will be about 10 clients in total. The ping
> time between data centers will be on the order of milliseconds.
> Could you share your numbers if that's ok?
> I believe there will be 3 servers per data center and one leader, whose
> locality depends on who wins the election, and all the write requests go
> to that leader. So potentially all write requests travel across data
> centers to get to the leader, and then the replicated data is spread out to
> all followers in all the data centers as well.
> On Wed, May 9, 2012 at 7:16 AM, Camille Fournier <[EMAIL PROTECTED]> wrote:
>> What are your BCP requirements? Do you need to span clusters because you
>> need continued availability if one cluster goes down? What write
>> throughput do you expect to need, how many clients do you anticipate
>> serving, how many locks will they need? Write throughput does go down
>> when you span clusters, but it's not as bad as you might think, unless
>> your ping time between clusters is very slow. I supported
>> cross-datacenter clusters doing quite respectable write throughput
>> (sorry, don't have any numbers handy but it was much more capacity
>> than my service needed), so I wouldn't overdesign your system before
>> checking the throughput you could get using a simple setup.
>> C
>> On Tue, May 8, 2012 at 11:27 PM, Narayanan A R
>> <[EMAIL PROTECTED]> wrote:
>> > Imagine the locks recipe needs to be used to synchronize resources across
>> > data centers. One option is to span the ensemble across all the data centers.
>> > But I am afraid this will significantly reduce the write throughput. The
>> > alternative is to set up ZK in one data center and have all the clients
>> > talk to the same cluster. Even with this approach the clients need to keep
>> > a connection open to a different data center. What I have in mind is to
>> > make the requests stateless and have a service offer locks.
>> >
>> > On Tue, May 8, 2012 at 6:42 AM, Camille Fournier <[EMAIL PROTECTED]> wrote:
>> >
>> >> It can, but it depends on what you're doing. If you want to give us
>> >> some more information on your proposed use case we can maybe help you
>> >> more.
>> >>
>> >> C
>> >>
>> >> On Tue, May 8, 2012 at 3:21 AM, Narayanan A R
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > Does ZK fit well for coordination across data centers?
>> >>
Narayanan A R 2012-05-09, 20:52