Cameron has it exactly right. You can't automatically detect the difference
between "the other two racks are down for maintenance" and "I am
partitioned from the other two racks". In the latter case, you will have
data loss and split-brain if you automatically convert to a single server
cluster because, if the other two racks could still communicate with each
other, they would also believe themselves to be a valid cluster. Depending
on which ZK quorum member the client was connected to it would see
He's also right that this will be nasty to reconfigure because if the other
2 racks (A and B) come up and believe themselves to be members of a
potential quorum they can still form a quorum with each other, even if you
have converted rack C to run as a standalone.
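To make the split-brain case concrete, here is a minimal sketch of the majority rule. It assumes one voting ZK server per rack, and the names (has_quorum, side_ab, side_c) are mine for illustration, not anything from ZooKeeper itself:

```python
def has_quorum(reachable, ensemble):
    """Majority rule: strictly more than half of the configured voters."""
    return len(reachable & ensemble) > len(ensemble) // 2


# Assumed layout for illustration: one voting server per rack.
full_ensemble = {"A", "B", "C"}

# Partition: A and B can still talk to each other, C is cut off.
side_ab = {"A", "B"}
side_c = {"C"}

print(has_quorum(side_ab, full_ensemble))  # True: A and B keep serving
print(has_quorum(side_c, full_ensemble))   # False: C correctly stops

# Now suppose C is converted to a single-server ensemble.
standalone = {"C"}
print(has_quorum(side_c, standalone))      # True: C also serves writes

# Both sides now believe they are a valid cluster.
split_brain = has_quorum(side_ab, full_ensemble) and has_quorum(side_c, standalone)
print(split_brain)                         # True
```

The point of the sketch is that C cannot tell "A and B are down" from "A and B are unreachable", so any rule that lets C go standalone on its own also lets it do so during a partition.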
This problem will be harder to handle in the other sites with only 2 racks,
because however you configure it, you will always have the risk of the
cluster being down due to not having quorum (if the rack with the majority
of nodes goes away), or being down because all the nodes are down (if all
nodes only live on one rack).
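The 2-rack arithmetic can be checked with a short sketch. The layouts below are hypothetical (3 voters split 2+1, 3 voters all on one rack, and 1+1+1 across three racks for comparison), and survives_loss is just a helper name I made up:

```python
def survives_loss(voters_per_rack, lost_rack):
    """True if the remaining voters are still a strict majority of the ensemble."""
    total = sum(voters_per_rack.values())
    remaining = total - voters_per_rack[lost_rack]
    return remaining > total // 2


# Hypothetical layouts, for illustration only.
layouts = {
    "2 racks, 2+1": {"A": 2, "B": 1},
    "2 racks, 3+0": {"A": 3, "B": 0},
    "3 racks, 1+1+1": {"A": 1, "B": 1, "C": 1},
}

for name, layout in layouts.items():
    result = {rack: survives_loss(layout, rack) for rack in layout}
    print(name, result)
```

With only two racks there is always one rack whose loss takes the ensemble down, whichever way the voters are split. With one voter on each of three racks, any single rack can go away, but never two.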
On Sun, Jan 12, 2014 at 6:10 PM, Cameron McKenzie <[EMAIL PROTECTED]> wrote:
> I'm sure that someone more knowledgeable will be able to provide more
> insight, but in the meantime...
> I don't believe that it's possible to have a configuration that will run as
> both a cluster and a standalone. I think that in your case of ONE rack
> running, you would need to manually reconfigure this node so that it's a
> cluster with only one instance. Then reconfigure it when the other racks
> come up. There are a few issues with this though.
> 1.) You have potential data loss. When your other 2 racks have gone down,
> writes may not have been propagated to your remaining rack. Whether this is
> acceptable or not depends on the data you're storing I guess.
> 2.) You need to be careful when bringing the other racks back into the
> cluster. If the two other racks start up again prior to reconfiguring your
> single node master, they may form a quorum amongst themselves, and this
> would cause any data that was written to the single instance while they
> were down to be discarded. So, you would need to reconfigure the single
> instance to once again be part of the triplex, restart this node, then
> start one of the other racks, wait for a quorum to be formed, and then
> start up the third rack.
> That's based on my limited understanding anyway
> On Sun, Jan 12, 2014 at 6:59 AM, Mark Farnan <[EMAIL PROTECTED]>
> > Howdy,
> > I would like some guidance about designing a zookeeper architecture to
> > support the following multi site scenario.
> > Based on what I read about the way elections work, I’m concerned the
> > below won’t be possible.
> > Scenario:
> > 1. I have a need to run some applications, which use zookeeper for state
> > (Kafka for Example). And run it as a single system, across multiple
> > 2. Environment consists of 3 sets of servers, Call them Rack A, B & C.
> > Each set consists of a rack of equipment with multiple (20+) machines in
> > it.
> > 3. Each Set/Rack is located in a different part of the building complex,
> > but ALL are connected by high speed network. (1gb/s or better).
> > 4. In NORMAL operation, ALL 3 Racks will be up and running.
> > 5. In a failure OR maintenance scenario, it can reduce to 2 racks
> > running, (1 shutdown), OR only 1 Rack running (2 racks shutdown).
> > Racks are shut down relatively regularly for maintenance and
> > patching, (at least every couple of months) or for Disaster testing.
> > I can’t change the network configuration, and have limited control over
> > when they shutdown various racks for maintenance.
> > Note 1: This is not an abstract problem; the setup exists, and there
> > is a similar setup, but with only 2 Racks (A & B), in at least 2 other
> > sites.
> > Note 2: There are no other ‘sites/racks’ in which to put some