Zookeeper >> mail # user >> Distributed ZooKeeper cluster design
Re: Distributed ZooKeeper cluster design
By zksmoketest I meant zk-latencies, which is in that same github repo.

On Tue, Dec 13, 2011 at 10:44 AM, Camille Fournier <[EMAIL PROTECTED]> wrote:

> Ted is of course right, but to speculate:
>
> The idea you had with 3 in C, one in A and one in B isn't bad, given
> some caveats.
>
> With 3 in C, as long as they are all available, quorum should live in
> C and you shouldn't have much slowdown from the remote servers in A
> and B. However, if you point your A servers only to the A zookeeper,
> you have a failover risk where your A servers will have no ZK if the
> server in region A goes down (same with B, of course). If you have a
> lot of servers in the outer regions, this could be a risk. You are
> also giving up any kind of load balancing for the A and B region ZKs,
> which may not be important but is good to know.
>
> Another thing to be aware of is that the A and B region ZKs will have
> slower write response time due to the WAN cost, and they will tend to
> lag behind the majority cluster a bit. This shouldn't cause
> correctness issues but could impact client performance in those
> regions.
>
> Honestly, if you're doing a read-mostly workload in the A and B
> regions, I doubt this is a bad design. It's pretty easy to test ZK
> setups using Pat's zksmoketest utility, so you might try setting up
> the sample cluster and running some of the smoketests on it.
> (https://github.com/phunt/zk-smoketest/blob/master/zk-smoketest.py).
> You could maybe also add observers in the outer regions to improve
> client load balancing.
>
> C
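The quorum arithmetic behind the 3-in-C layout Camille describes can be sketched in a few lines. This is an illustrative standalone snippet, not ZooKeeper code; `has_quorum` is a hypothetical helper:

```python
# Sketch of ZooKeeper quorum arithmetic for the proposed
# 3 (region C) + 1 (region A) + 1 (region B) ensemble.
# has_quorum is a hypothetical helper, not a ZooKeeper API.

def has_quorum(total_voters, alive_voters):
    """A ZooKeeper ensemble needs a strict majority of voters alive."""
    return alive_voters > total_voters // 2

ENSEMBLE = 5  # 3 voters in C, 1 in A, 1 in B

# All three C servers alive: quorum lives entirely in C,
# regardless of what happens to A and B.
print(has_quorum(ENSEMBLE, 3))      # True

# Lose one C server: the two remaining C servers alone are
# not a majority, so quorum now needs at least one of A or B,
# and writes start paying WAN latency.
print(has_quorum(ENSEMBLE, 2))      # False
print(has_quorum(ENSEMBLE, 2 + 1))  # True (two C servers plus A)
```

This is why losing a single server in C is survivable, but it makes write latency sensitive to the WAN links until that server comes back.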
>
>
>
> On Tue, Dec 13, 2011 at 9:05 AM, Ted Dunning <[EMAIL PROTECTED]>
> wrote:
> > Which option is preferred really depends on your needs.
> >
> > Those needs are likely to vary in read/write ratios, resistance to
> > network problems, and so on.  You should also consider the possibility
> > of observers in the remote locations.  You might also consider separate
> > ZK clusters in each location with a special process to send mirrors of
> > changes to those other locations.
> >
> > A complete and detailed answer really isn't possible without knowing the
> > details of your application.  I generally don't like distributing a ZK
> > cluster across distant hosts because it makes everything slower and more
> > delicate, but I have heard of examples where that is exactly the right
> > answer.
> >
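The observers Ted and Camille mention are declared in zoo.cfg. A minimal sketch, assuming hypothetical hostnames (zk1-zk3 as voters in region C, one observer each in A and B); the same server list goes in every node's config:

```
# zoo.cfg sketch (hostnames are placeholders)
server.1=zk1.regionC.example:2888:3888
server.2=zk2.regionC.example:2888:3888
server.3=zk3.regionC.example:2888:3888
server.4=zkA.regionA.example:2888:3888:observer
server.5=zkB.regionB.example:2888:3888:observer

# Additionally, on server.4 and server.5 only:
# peerType=observer
```

Observers receive the committed update stream but do not vote, so adding them in A and B serves local reads without enlarging the voting quorum in C or slowing writes.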
> > On Tue, Dec 13, 2011 at 4:29 AM, Dima Gutzeit
> > <[EMAIL PROTECTED]> wrote:
> >
> >> Dear list members,
> >>
> >> I have a question related to "suggested" way of working with ZooKeeper
> >> cluster from different geographical locations.
> >>
> >> Let's assume a service spans several regions, A, B and C, where C
> >> is defined as an element that the service cannot live without, while
> >> A and B are not critical.
> >>
> >> Option one:
> >>
> >> Having one cluster of several ZooKeeper nodes in one location (C) and
> >> accessing it from all locations (A, B and C).
> >>
> >> Option two:
> >>
> >> Having the ZooKeeper cluster span all regions, i.e. 3 nodes in C,
> >> one in A and one in B. This way the clients residing in A and B
> >> will access their local ZooKeeper.
> >>
> >> Which option is preferred, and which will work faster from the
> >> client's perspective?
> >>
> >> Thanks in advance.
> >>
> >> Regards,
> >> Dima Gutzeit
> >>
>
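On the failover risk Camille raises (region-A clients pointed only at the A server): the ZooKeeper client takes a comma-separated list of servers and fails over among them, so listing all five servers keeps A clients alive when the local server dies. A rough sketch of that client-side behavior in plain Python, with hypothetical hostnames, not using the real client library:

```python
import random

# Hypothetical hostnames for the 3 + 1 + 1 ensemble discussed above.
CONNECT_STRING = (
    "zkA.regionA.example:2181,"
    "zk1.regionC.example:2181,zk2.regionC.example:2181,"
    "zk3.regionC.example:2181,zkB.regionB.example:2181"
)

def host_list(connect_string):
    """Split a ZooKeeper connect string into (host, port) pairs.

    The real client shuffles this list and walks it on connection
    loss, which is what saves region-A clients when the A server
    goes down.
    """
    pairs = []
    for part in connect_string.split(","):
        host, port = part.rsplit(":", 1)
        pairs.append((host, int(port)))
    return pairs

servers = host_list(CONNECT_STRING)
random.shuffle(servers)  # mirrors the client's load-balancing shuffle
print(len(servers))      # 5 candidates, so no region is stranded
```

The trade-off Camille notes still applies: A clients that fail over to C pay WAN latency, but they keep a working session instead of losing ZK entirely.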