Zookeeper, mail # user - Question about multi rack/DC support


Re: Question about multi rack/DC support
Cameron McKenzie 2014-01-12, 23:10
I'm sure that someone more knowledgeable will be able to provide more
insight, but in the meantime...

I don't believe that it's possible to have a configuration that will run as
both a cluster and a standalone. I think that in your case of ONE rack
running, you would need to manually reconfigure this node so that it's a
cluster with only one instance. Then reconfigure it when the other racks
come up. There are a few issues with this though.
1.) You have potential data loss. When your other 2 racks have gone down,
writes may not have been propagated to your remaining rack. Whether this is
acceptable or not depends on the data you're storing I guess.
2.) You need to be careful when bringing the other racks back into the
cluster. If the two other racks start up again prior to reconfiguring your
single-node master, they may form a quorum amongst themselves, and this
would cause any data that was written to the single instance while they
were down to be discarded. So, you would need to reconfigure the single
instance to once again be part of the triplex, restart this node, then
start one of the other racks, wait for a quorum to be formed, and then
start up the third rack.
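
To make the quorum arithmetic behind this concrete, here is a rough sketch
(not anything ZooKeeper ships). It assumes the default majority quorum and a
hypothetical layout of one ZooKeeper server per rack; the function and
variable names are made up for illustration.

```python
from itertools import combinations


def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    return ensemble_size // 2 + 1


def has_quorum(servers_per_rack: dict, racks_up: set) -> bool:
    """True if the servers in the running racks form a majority of all servers."""
    total = sum(servers_per_rack.values())
    alive = sum(n for rack, n in servers_per_rack.items() if rack in racks_up)
    return alive >= quorum_size(total)


def surviving_scenarios(servers_per_rack: dict) -> dict:
    """Map every non-empty combination of running racks to its quorum status."""
    racks = sorted(servers_per_rack)
    result = {}
    for k in range(1, len(racks) + 1):
        for up in combinations(racks, k):
            result["".join(up)] = has_quorum(servers_per_rack, set(up))
    return result


# Assumption: one ZooKeeper server in each of racks A, B and C.
layout = {"A": 1, "B": 1, "C": 1}

if __name__ == "__main__":
    for racks_up, ok in surviving_scenarios(layout).items():
        print(racks_up, "quorum" if ok else "no quorum")
```

With that layout any two racks keep a quorum, but no single rack does, which
is why the one-rack case needs the manual reconfiguration described above.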

That's based on my limited understanding anyway
cheers
Cam
On Sun, Jan 12, 2014 at 6:59 AM, Mark Farnan <[EMAIL PROTECTED]> wrote:

> Howdy,
>
> I would like some guidance about designing a zookeeper architecture to
> support the following multi site scenario.
> Based on what I read about the way elections work,  I’m concerned the
> below won’t be possible.
>
>
>
> Scenario:
> 1. I need to run some applications that use ZooKeeper for state
> (Kafka, for example), and run them as a single system across multiple sites.
>
> 2. Environment consists of 3 sets of servers,  Call them Rack A, B & C.
>  Each set consists of a rack of equipment with multiple (20+) machines in
> it.
>
> 3. Each Set/Rack is located in a different  part of the building complex,
>      but ALL are connected by high speed network. (1gb/s or better).
>
> 4.  In NORMAL operation, ALL 3 racks will be up and running.
>
> 5.  In a failure OR maintenance scenario,   it can reduce to 2 racks
> running, (1 shutdown),  OR  only 1 Rack running (2 racks shutdown).
>         Racks are shut down relatively regularly for maintenance and
> patching, (at least every couple of months) or for Disaster testing.
>
>
> I can’t change the network configuration, and have limited control over
> when they shutdown various racks for maintenance.
> Note 1:    This is not an abstract problem; the setup exists, and there
> is a similar setup, but with only 2 racks (A & B), in at least 2 other
> sites.
>
> Note 2:         There are no other ‘sites/racks’ in which to put some
> kind of election broker node.  I only have these 3 locations to choose from.
>
>
> Question is:  Is it possible to run ZooKeeper in this environment,
> supporting ONE rack running (hard), as well as ‘everything running’
> (easy) and ‘2 racks running’ (easy)?
>
>
> All Assistance gratefully appreciated
>
> If anyone has run a similar setup and would like to discuss further,  I’d
> be more than happy to
>
> Regards
>
> Mark.