Zookeeper >> mail # user >> Rolling config change considered harmful?


Re: Rolling config change considered harmful?
Thanks a lot Alex!
Excellent explanation :-)
On Sat, Jun 15, 2013 at 7:01 AM, Alexander Shraer <[EMAIL PROTECTED]> wrote:

> Hi German,
>
> During normal operation ZK guarantees that a quorum (any majority) of
> the ZK ensemble has all operations that may have been committed.
>
> So without dynamic reconfiguration you should ensure that when you're
> changing the ensemble, any possible quorum of the new ensemble
> necessarily intersects with any quorum of the 'old' ensemble.
>
> If you add D and E right away this property may not be guaranteed,
> since a new quorum (3 out of 5) can be, for example, C, D, E, whereas
> it's possible that only A and B have the latest state, so it'll be
> lost.
>
> To ensure this when going from 3 to 5 servers you should probably do 2
> transitions. First add server D. Any majority here, i.e., any 3 servers
> out of 4, will necessarily contain at least 2 servers from the original
> ensemble (A, B, C). So at least one server in any new quorum actually
> has the latest state.
>
> Then add E. A quorum here is any 3 out of 5 servers, so even if some
> quorum includes E and the server from (A, B, C, D) that
> didn't have the latest state, we still have 1 server in the quorum
> that does have the latest state, so we're fine...
>
> As long as you add servers one by one and wait for leader election to
> complete in every stage, or preserve the quorum intersection property
> in some other way, it should be safe. But with dynamic reconfig you
> don't need to do that, and no reboots are necessary, of course.
>
> Alex
>
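
The quorum intersection argument above can be checked by brute force. The following is only a minimal sketch, assuming simple majority quorums and the server names A to E used in this thread; the helper names `majorities` and `non_intersecting_pairs` are made up for illustration and are not part of ZooKeeper.

```python
from itertools import combinations


def majorities(ensemble):
    """Return every minimal majority quorum of the ensemble as a frozenset."""
    size = len(ensemble) // 2 + 1
    return [frozenset(q) for q in combinations(sorted(ensemble), size)]


def non_intersecting_pairs(old, new):
    """List (old_quorum, new_quorum) pairs that share no server."""
    return [(o, n) for o in majorities(old) for n in majorities(new) if not (o & n)]


old = {"A", "B", "C"}

# Going straight from 3 to 5 servers: some quorum pairs are disjoint.
direct = non_intersecting_pairs(old, {"A", "B", "C", "D", "E"})

# Going one server at a time: 3 -> 4, then 4 -> 5.
step1 = non_intersecting_pairs(old, {"A", "B", "C", "D"})
step2 = non_intersecting_pairs({"A", "B", "C", "D"}, {"A", "B", "C", "D", "E"})

for o, n in direct:
    print("disjoint:", sorted(o), sorted(n))
print("3 -> 4 disjoint pairs:", len(step1))
print("4 -> 5 disjoint pairs:", len(step2))
```

Checking only the minimal majorities is enough here, because any larger quorum contains one of them. The sketch says nothing about waiting for leader election between steps, which still matters in practice.
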
> On Fri, Jun 14, 2013 at 9:14 PM, German Blanco
> <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > Could you please clarify if this thread is about a rolling start in an
> > ensemble without the dynamic reconfiguration support?
> > And when you say "Create a 5 node ensemble", that means quorum is 5. But
> > then you give server lists with only 3 servers in each node?
> > If the server list has 3 servers, then quorum is actually 3 and what is
> > described may happen in that scenario.
> > In that case C follows B, E follows D and A follows either B or D and
> > there are two working ensembles.
> > It should be possible to create problems, even with more standard
> > configuration changes:
> > If we want to change a quorum of three to a quorum of five {A,B,C} to
> > {A,B,C,D,E}:
> > - First the configuration is changed in all the nodes, but they are not
> > restarted. Only A, B and C are running.
> > - One of them is stopped (e.g. A).
> > - At this point, if A, D and E are started with the new configuration,
> > they may elect a leader before any of them is aware of either B or C,
> > form an ensemble and start serving txns.
> > - However, if A is started, we wait until it connects to the leader of B
> > and C, and then D and E are started and then B and C are restarted,
> > everything should be ok. The fact that this depends on the human ability
> > to start D and E while A, B and C are connected to the ensemble seems a
> > bit risky though.
> > I have found a presentation on the topic:
> >
> > http://www.slideshare.net/Hadoop_Summit/dynamic-reconfiguration-of-zookeeper
> >
> > If anybody knows of a safer way to change a quorum of 3 to a quorum of 5
> > with e.g. zookeeper 3.4.5, please point it out.
> >
> > Regards,
> >
> > Germán.
> >
> >
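
The risky state in the steps above comes down to two groups of servers each counting a majority against a different server list. A rough sketch, assuming only membership counting matters (it ignores leader election details and timing); `can_form_quorum` and the variable names are hypothetical:

```python
def can_form_quorum(group, view):
    """True if `group` holds a strict majority of the servers listed in `view`."""
    return len(group & view) > len(view) // 2


old_view = {"A", "B", "C"}            # server list B and C are still running with
new_view = {"A", "B", "C", "D", "E"}  # server list A, D and E are started with

still_old = {"B", "C"}         # never restarted, still on the old configuration
started_new = {"A", "D", "E"}  # started together with the new configuration

# Both groups reach a majority of their own view and share no server,
# so each could elect a leader independently.
split_brain = (
    can_form_quorum(still_old, old_view)
    and can_form_quorum(started_new, new_view)
    and not (still_old & started_new)
)
print("two independent ensembles possible:", split_brain)
```
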
> > On Fri, Jun 14, 2013 at 11:46 PM, Jordan Zimmerman <
> > [EMAIL PROTECTED]> wrote:
> >
> >> I got the test cluster into the state described with 2 leaders. I then
> >> allocated 100 Curator clients to write nodes "/n" where n is the index
> >> (i.e. "/0", "/1", …). The idea is that the nodes would be distributed
> >> around the cluster instances. I then allocated a single Curator instance
> >> dedicated to one of the server instances, did a sync, and did an exists()
> >> to verify that each cluster instance had all the nodes. For the 2 leader
> >> cluster, this fails.
> >>
> >> -JZ
> >>
> >> On Jun 14, 2013, at 1:54 PM, "FPJ" <[EMAIL PROTECTED]> wrote: