If you had a very heavy read load with a light write load, I think you should be able to do this with observers as the regional tier. I solved the problem of needing a global service a different way (through API concepts) because of concerns around WAN traffic. I didn't do much server config tuning as a result. I thought for a long time about whether I could use the hierarchical quorum to get around this, but for my particular use case it wasn't useful. It would be interesting to see the limitations of a truly "global" ZK cluster deployment; my business is too sensitive to outages for me to be the one to roll those dice. Also, I suspect it depends a lot on how good the pipes between your global datacenters are.
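
As a rough illustration of the trade-off described above: only voting members take part in a write's quorum, so a regional observer adds no vote round trip, but a write sent to it still has to be forwarded to the leader. The sketch below is a back-of-the-envelope model, not a measurement. The function name and all round-trip times are made-up assumptions, and it ignores disk sync, queueing, and the client's own hop to its server.

```python
def estimated_write_latency_ms(leader_to_voter_rtts_ms, attached_server_to_leader_rtt_ms=0.0):
    """Rough network-only commit latency for one write (hypothetical model).

    leader_to_voter_rtts_ms: RTT from the leader to each *other* voting member.
    attached_server_to_leader_rtt_ms: RTT from the server the client is
        connected to (for example a regional observer) to the leader;
        0 if the client is connected to the leader itself.
    """
    ensemble_size = len(leader_to_voter_rtts_ms) + 1  # other voters plus the leader
    quorum = ensemble_size // 2 + 1                   # strict majority of voters
    acks_needed = quorum - 1                          # the leader counts as one vote
    fastest_first = sorted(leader_to_voter_rtts_ms)
    # The leader can commit once the slowest of the fastest `acks_needed`
    # voters has acknowledged the proposal.
    quorum_wait = fastest_first[acks_needed - 1] if acks_needed else 0.0
    return attached_server_to_leader_rtt_ms + quorum_wait


# Assumed numbers: 1 ms inside a region, 80 ms and 150 ms between regions.
local_voters = estimated_write_latency_ms([1.0, 1.0])
spread_voters = estimated_write_latency_ms([80.0, 150.0])
via_far_observer = estimated_write_latency_ms([1.0, 1.0], attached_server_to_leader_rtt_ms=150.0)
print(local_voters, spread_voters, via_far_observer)
```

Under these assumed numbers, spreading the voters across regions makes every write pay an inter-region round trip, while keeping voters together and adding far observers makes only the writes issued in the far region pay it. Reads served by the observer stay local in both cases.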
From: Oliver Wulff [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 04, 2011 3:44 PM
To: [EMAIL PROTECTED]
Subject: Re: Importance of latency in a global deployment
Looking forward to the feedback from Camille...
Maybe a crazy idea, but couldn't we implement something similar to DNS? We
would have one top-level cluster (at least three servers) and then a child
cluster for each geographical region. The ZooKeeper client communicates with
the local cluster only.
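
One way to picture the "local cluster only" part on the client side is to build each client's connect string from the servers in its own region. This is only a sketch: the helper name, the region keys, and the host names are hypothetical, and it does not address how the regional clusters would stay in sync with a top-level one.

```python
def build_connect_string(servers_by_region, local_region, allow_far_failover=False):
    """Return a 'host:port,host:port' connect string for one client.

    servers_by_region: dict mapping a region name to a list of 'host:port'.
    local_region: the region this client runs in.
    allow_far_failover: also list servers from other regions. Note that
        listing them does not express a preference; as far as I know the
        stock Java client shuffles the host list, so the client may end up
        on a far server even while a near one is healthy.
    """
    local = list(servers_by_region.get(local_region, []))
    if not local:
        raise ValueError("no servers configured for region %r" % local_region)
    far = []
    if allow_far_failover:
        for region in sorted(servers_by_region):
            if region != local_region:
                far.extend(servers_by_region[region])
    return ",".join(local + far)


# Hypothetical hosts, for illustration only.
servers = {
    "eu": ["zk1.eu.example.com:2181", "zk2.eu.example.com:2181"],
    "us": ["zk1.us.example.com:2181"],
}
print(build_connect_string(servers, "eu"))
print(build_connect_string(servers, "eu", allow_far_failover=True))
```

With `allow_far_failover=False` the client never crosses the WAN but has no fallback if its whole region is down; with `True` it gains a fallback at the cost of possibly landing on a far server.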
2011/5/4 Patrick Hunt <[EMAIL PROTECTED]>
> Camille did you tune any of the server configuration parameters? I
> think this would be interesting/useful for ppl.
> You are correct about write latency and issues wrt a client's server
> selection. This jira introduces the idea of allowing addl connection
> which for this case might be interesting - the client would attempt to
> connect to the "closest available server", fail over to a far server
> if necessary, but then keep checking for closer servers to become
> available over time (say the server recovers). Today you would fail
> over to another (potentially far) server, but never reconnect back to
> the closer server.
> On Wed, May 4, 2011 at 9:55 AM, Fournier, Camille F. [Tech]
> <[EMAIL PROTECTED]> wrote:
> > Global clusters will affect writes greatly, and may also affect your
> client reads in an indirect manner.
> > Writes, having to traverse from one region to another for purposes of
> voting, will be slowed down considerably by the ping time between regions.
> > If you did a three node deployment in the manner you mentioned, your
> clients may also suffer. Usually you would want to have a list of all
> available cluster members for your client to connect to, so if one is down
> or goes down the client can fail over to a running node. However, given that
> your client will have regional affinity for at most one of your servers, if
> you use the standard zk client connection logic your client either may be
> connected to a far region (slowing down all responses due to latency) or the
> client would have no failover node available should their close region node
> fail. If you choose to have clients able to connect to any node you may also
> have wan traffic considerations.
> > Some of the client side issues may be alleviated by using observers.
> > I've got deployments across regions to handle data center failure, but in
> all cases the off-region member is not available for client connections, and
> is kept from acting as leader to prevent slowness on writes.
> > C
> > ----- Original Message -----
> > From: Oliver Wulff <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> > Sent: Wed May 04 12:37:43 2011
> > Subject: Importance of latency in a global deployment
> > Hi there
> > I'm quite new to the zookeeper project and got a question regarding
> > robustness of the failover functionality in a global deployment.
> > Are there any pre-conditions on how close the zookeeper servers must be to
> > each other from a geographical distance point of view?
> > The reason is that the servers have to monitor and sync with each other