Zookeeper, mail # dev - peer goes in LEADING state even if ensemble is online


Re: peer goes in LEADING state even if ensemble is online
Vishal Kher 2011-01-15, 04:52
Folks,

Opened a jira for this: https://issues.apache.org/jira/browse/ZOOKEEPER-975

Please let me know if my proposal seems ok. I will do the change once we
agree.

Thanks.

On Wed, Jan 12, 2011 at 10:00 PM, Mahadev Konar <[EMAIL PROTECTED]> wrote:

> Forwarding it to the dev list.
>
> Thanks
> mahadev
>
>
> On 1/11/11 11:34 PM, "Vishal Kher" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Scenario:
> > 1. 2 of the 3 ZK nodes are online
> > 2. Third node is attempting to join
> > 3. Third node unnecessarily goes into the LEADING state
> > 4. The third node then goes back to LOOKING (no majority of followers) and
> > finally goes to the FOLLOWING state.
> >
> > While going through the logs, I noticed that a peer C that is trying to
> > join an already formed cluster goes into the LEADING state. This is because
> > the QuorumCnxManager of A and B sends the entire history of notification
> > messages to C. C receives the notification messages that A and B exchanged
> > while they were forming the cluster.
> >
> > In FastLeaderElection.lookForLeader(), because of the following piece of
> > code, C quits lookForLeader() assuming that it is supposed to lead.
> >
> >     // If have received from all nodes, then terminate
> >     if ((self.getVotingView().size() == recvset.size()) &&
> >             (self.getQuorumVerifier().getWeight(proposedLeader) != 0)) {
> >         self.setPeerState((proposedLeader == self.getId()) ?
> >                 ServerState.LEADING : learningState());
> >         leaveInstance();
> >         return new Vote(proposedLeader, proposedZxid);
> >
> >     } else if (termPredicate(recvset,
> >
> >
> > In general, this does not affect the correctness of FLE, since C will
> > eventually go back to the FOLLOWING state (A and B won't vote for C).
> > However, this delays C from joining the cluster, which can in turn affect
> > the recovery time of an application.
> >
> > I think A and B should send only the latest (most recent) notification
> > instead of the entire history. Does this sound reasonable?
> >
> > Thanks.
> > -Vishal
> >
>
>
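
For illustration, the proposal quoted above (send only the most recent
notification instead of the entire history) could be sketched roughly as
follows. This is a simplified, hypothetical model and not the actual
QuorumCnxManager code: the class names HistoryOutbox and LatestOnlyOutbox, the
Notification fields, and the server ids and votes in main() are all assumptions
made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these classes exist in ZooKeeper.
class LatestNotificationSketch {

    // Simplified notification: the proposed leader, its zxid, and the
    // state the sender was in when it sent the message.
    static final class Notification {
        final long proposedLeader;
        final long proposedZxid;
        final String senderState;

        Notification(long proposedLeader, long proposedZxid, String senderState) {
            this.proposedLeader = proposedLeader;
            this.proposedZxid = proposedZxid;
            this.senderState = senderState;
        }
    }

    // Models the behaviour described in the report: every notification queued
    // for an offline peer is kept and delivered once that peer connects.
    static final class HistoryOutbox {
        private final Map<Long, List<Notification>> pending = new HashMap<>();

        void enqueue(long destId, Notification n) {
            pending.computeIfAbsent(destId, k -> new ArrayList<>()).add(n);
        }

        List<Notification> drain(long destId) {
            List<Notification> out = pending.remove(destId);
            return out == null ? Collections.<Notification>emptyList() : out;
        }
    }

    // Models the proposal: a newer notification for the same destination
    // overwrites the older one, so at most one message is ever delivered.
    static final class LatestOnlyOutbox {
        private final Map<Long, Notification> pending = new HashMap<>();

        void enqueue(long destId, Notification n) {
            pending.put(destId, n);
        }

        List<Notification> drain(long destId) {
            Notification n = pending.remove(destId);
            return n == null
                    ? Collections.<Notification>emptyList()
                    : Collections.singletonList(n);
        }
    }

    public static void main(String[] args) {
        long c = 3L; // assumed id of the joining peer C
        HistoryOutbox history = new HistoryOutbox();
        LatestOnlyOutbox latest = new LatestOnlyOutbox();

        // Assumed sequence of votes A sent while A and B formed the cluster.
        Notification[] sentByA = {
            new Notification(1L, 0L, "LOOKING"),
            new Notification(2L, 0L, "LOOKING"),
            new Notification(2L, 0L, "FOLLOWING"),
        };
        for (Notification n : sentByA) {
            history.enqueue(c, n);
            latest.enqueue(c, n);
        }

        System.out.println("history outbox delivers "
                + history.drain(c).size() + " notification(s) to C");
        System.out.println("latest-only outbox delivers "
                + latest.drain(c).size() + " notification(s) to C");
    }
}
```

With the latest-only variant, C would see just the final vote from each peer
rather than the stale LOOKING votes from the original election. Whether that
alone is sufficient in the real code is something to confirm on the jira.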