Zookeeper >> mail # user >> Rolling config change considered harmful?

Re: Rolling config change considered harmful?
Hi German,

During normal operation ZK guarantees that a quorum (any majority) of
the ZK ensemble has all operations that may have been committed.

So without dynamic reconfiguration you should ensure that when you're
changing the ensemble, any possible quorum of the new ensemble
necessarily intersects with any quorum of the 'old' ensemble.

If you add D and E right away this property may not be guaranteed,
since a new quorum (3 out of 5) can be, for example, C, D, E, whereas
it's possible that only A and B have the latest state, so it'll be
lost.

To ensure this when going from 3 to 5 servers you should probably do 2
transitions. First add server D. Any majority here, i.e., any 3 servers
out of 4, will necessarily contain at least 2 servers from the original
ensemble (A, B, C). Since at least 2 of those 3 have the latest state,
at least one server in any new quorum actually has the latest state.

Then add E. A quorum here is any 3 out of 5 servers, so even if some
quorum includes E and the server from (A, B, C, D) that
didn't have the latest state, we still have 1 server in the quorum
that does have the latest state, so we're fine...
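
The reasoning above can be checked by brute force. The following is
only a sketch in plain Python, not anything ZooKeeper itself provides:
the function names are made up for illustration, and it assumes a
quorum is any simple majority and that the new ensemble contains the
old one. It enumerates the minimal majorities of the old and new
ensembles and reports pairs that share no server.

```python
from itertools import combinations


def majority_quorums(ensemble):
    """Return every minimal majority (floor(n/2) + 1 servers) of the ensemble.

    Checking minimal majorities is enough: any larger quorum is a superset
    of one of them, so it intersects whatever the minimal one intersects.
    """
    size = len(ensemble) // 2 + 1
    return [frozenset(q) for q in combinations(sorted(ensemble), size)]


def disjoint_quorum_pairs(old, new):
    """List (old_quorum, new_quorum) pairs that have no server in common."""
    return [
        (o, n)
        for o in majority_quorums(old)
        for n in majority_quorums(new)
        if not (o & n)
    ]


def transition_is_safe(old, new):
    """True if every quorum of `new` intersects every quorum of `old`."""
    return not disjoint_quorum_pairs(old, new)


# Hypothetical server names, matching the letters used in this thread.
three = {"A", "B", "C"}
four = three | {"D"}
five = four | {"E"}

print(transition_is_safe(three, five))  # direct 3 -> 5
print(transition_is_safe(three, four))  # step 1: add D
print(transition_is_safe(four, five))   # step 2: add E
```

For the direct 3 to 5 change, disjoint_quorum_pairs(three, five)
includes the pair ({A, B}, {C, D, E}) mentioned earlier, while both
single-server steps come back with no disjoint pairs.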

As long as you add servers one by one and wait for leader election to
complete at every stage, or preserve the quorum intersection property
in some other way, it should be safe. But with dynamic reconfig you
don't need to do that, and of course no reboots are necessary.
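
The one-by-one rule can also be stated as a counting argument. Below
is a rough sketch, again with made-up helper names and assuming
servers are only added (never removed) and a quorum is a simple
majority. Two sets of sizes a and b inside a set of N servers must
overlap when a + b > N, so it is enough to compare the two majority
sizes against the size of the new ensemble. The sketch does not model
waiting for leader election between steps; that still has to be done
by hand.

```python
def majority(n):
    """Size of a simple majority among n servers."""
    return n // 2 + 1


def quorums_always_intersect(old_size, added):
    """Pigeonhole check for growing an ensemble by `added` servers.

    An old quorum and a new quorum both live inside the new ensemble,
    so they must share a server when their sizes sum to more than the
    new ensemble size.
    """
    new_size = old_size + added
    return majority(old_size) + majority(new_size) > new_size


def one_by_one_plan(current, additions):
    """Return the successive ensembles when adding one server per step."""
    ensemble = list(current)
    steps = []
    for server in additions:
        # Adding a single server always satisfies the intersection check.
        assert quorums_always_intersect(len(ensemble), 1)
        ensemble = ensemble + [server]
        steps.append(tuple(ensemble))
    return steps


# Hypothetical server names, matching the letters used in this thread.
for step in one_by_one_plan(["A", "B", "C"], ["D", "E"]):
    print(step)

print(quorums_always_intersect(3, 2))  # the direct 3 -> 5 jump
```

With these definitions the check passes for every single-server
addition, which is why going through the 4-server stage works while
the direct jump from 3 to 5 does not.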


On Fri, Jun 14, 2013 at 9:14 PM, German Blanco
> Hello,
> Could you please clarify if this thread is about a rolling start in an
> ensemble without the dynamic reconfiguration support?
> And when you say "Create a 5 node ensemble", that means quorum is 5. But
> then you give server lists with only 3 servers in each node?
> If the server list has 3 servers, then quorum is actually 3 and what is
> described may happen in that scenario.
> In that case C follows B, E follows D and A follows either B or D and there
> are two working ensembles.
> It should be possible to create problems, even with more standard
> configuration changes:
> If we want to change a quorum of three to a quorum of five {A,B,C} to
> {A,B,C,D,E}:
> - First the configuration is changed in all the nodes, but they are not
> restarted. Only A, B and C are running.
> - One of them is stopped (e.g. A).
> - At this point, if A, D and E are started with the new configuration, they
> may elect a leader before any of them is aware of either B or C, form an
> ensemble and start serving txns.
> - However, if A is started, we wait until it connects to the leader of B
> and C, and then D and E are started and then B and C are restarted,
> everything should be ok. The fact that this depends on the human ability to
> start D and E while A,B and C are connected to the ensemble seems a bit
> risky though.
> I have found a presentation on the topic:
> http://www.slideshare.net/Hadoop_Summit/dynamic-reconfiguration-of-zookeeper
> If anybody knows of a safer way to change a quorum of 3 to a quorum of 5
> with e.g. zookeeper 3.4.5, please point it out.
> Regards,
> Germán.
> On Fri, Jun 14, 2013 at 11:46 PM, Jordan Zimmerman <
>> I got the test cluster into the state described with 2 leaders. I then
>> allocated 100 Curator clients to write nodes "/n" where n is the index
>> (i.e. "/0", "/1", …). The idea was that the nodes would be distributed around
>> the cluster instances. I then allocated a single Curator instance dedicated
>> to one of the server instances, did a sync, and did an exists() to verify
>> that each cluster instance had all the nodes. For the 2 leader cluster,
>> this fails.
>> -JZ
>> On Jun 14, 2013, at 1:54 PM, "FPJ" <[EMAIL PROTECTED]> wrote:
>> > I messed up the last sentence, here is what I was trying to say:
>> >
>> > It is ok to have two servers thinking they are leaders as long as only
>> > one is able to commit txns at a time by having a quorum of supporters.
>> > Each server is going to follow a single leader, so I don't see a problem
>> > in your