Re: Failure scenarios and consequences
Hi Jeremy,

Someone can correct me if I'm wrong here, but I believe I know the general
idea.

A write is not 'written' until a quorum of peers (any quorum will do)
acknowledges that it will commit the write.  Therefore, after the
partition, the old leader will not be able to get a quorum for any new
writes, so none of them can commit.  The new leader will be able to get a
quorum, and because any two quorums share at least one node, it's
guaranteed that at least one node in the quorum that elected the new
leader knows about every transaction committed before the partition.  The
new leader replicates those transactions accordingly.
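
To make the overlap argument concrete, here is a minimal sketch in Python.
It assumes simple majority quorums and a five-node ensemble; the server
names (zk1..zk5) and the helper functions are hypothetical and are not part
of any ZooKeeper API.

```python
from itertools import combinations


def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble (assumes majority quorums)."""
    return ensemble_size // 2 + 1


def has_quorum(reachable: set, ensemble: set) -> bool:
    """True if the reachable servers form a majority of the ensemble."""
    return len(reachable & ensemble) >= quorum_size(len(ensemble))


def quorums_always_overlap(ensemble: set) -> bool:
    """Brute-force check that every pair of minimal majorities shares a node."""
    members = sorted(ensemble)
    q = quorum_size(len(members))
    return all(
        set(a) & set(b)
        for a in combinations(members, q)
        for b in combinations(members, q)
    )


# Hypothetical five-node ensemble split 2/3 by a partition.
ensemble = {"zk1", "zk2", "zk3", "zk4", "zk5"}
old_leader_side = {"zk1", "zk2"}         # old leader stranded here
new_leader_side = {"zk3", "zk4", "zk5"}  # this side can elect a new leader

print(has_quorum(old_leader_side, ensemble))   # old leader cannot commit
print(has_quorum(new_leader_side, ensemble))   # new leader can commit
print(quorums_always_overlap(ensemble))        # any two majorities intersect
```

The last check is the reason a committed write cannot be lost: whichever
quorum acknowledged it must share a node with whichever quorum elects the
next leader.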

~Jared

On Thu, Dec 9, 2010 at 4:29 PM, Jeremy Hanna <[EMAIL PROTECTED]> wrote:

> I created a link off of the main wiki and the page itself:
> http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios
>
> Would someone please review it?  Specifically, I am curious to know about
> this:
> "if the leader is in the non-quorum side of the partition, that side of the
> partition will recognize that it no longer has a quorum of the ensemble. The
> leader will be demoted to being a regular ZooKeeper server and those nodes
> will no longer accept reads or writes."
> I just wanted to clarify - in the time for the non-quorum side to recognize
> it is no longer a quorum, will there ever be writes that get through?  Is it
> guaranteed that it won't accept writes after the partition?  I don't think
> that guarantee can exist, but wondered how to handle that.
>
> On Dec 9, 2010, at 2:04 PM, Mahadev Konar wrote:
>
> > Hi Jeremy,
> >   Responses in line below:
> >
> > On 12/9/10 11:53 AM, "Jeremy Hanna" <[EMAIL PROTECTED]> wrote:
> >
> > I looked around on the wiki and in the user list archives and couldn't
> > find something definitive about certain failure scenarios.
> >
> > A partition splits the ensemble where a quorum is on one side of the
> > partition
> > -- if the leader is on the quorum side of the partition, what happens to
> > reads/writes that go to the non-quorum side?  I assume writes return errors
> > because it can't get to the leader.  Reads?
> >
> >> The reads will also fail on all the quorum nodes until a new quorum is
> >> elected.
> >
> > -- if the leader is on the non-quorum side of the partition, I would
> > assume that the quorum side of the partition would elect a new leader for
> > those clients on its side of the partition.  However, is there the
> > possibility for the leader on the non-quorum side to accept writes before it
> > realizes that there's no longer a quorum?  Just wondering about the
> > possibility of corruption and then when the cluster syncs back up how the
> > cluster would handle that data.
> >
> >> No, there isn't. The leader relinquishes its right as a leader as soon as
> >> it realizes a quorum isn't committing the changes it proposed.
> >
> > (I would be happy to create a wiki page for failure scenarios if one
> > doesn't exist that people could add to, but maybe this is just common
> > knowledge.)
> >
> >> Please do!
> >
> > thanks
> > mahadev
>
>