Re: [jira] Commented: (ZOOKEEPER-975) new peer goes in LEADING state even if ensemble is online

"Vishal K (JIRA)" <[EMAIL PROTECTED]> wrote:
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004273#comment-13004273 ]

Vishal K commented on ZOOKEEPER-975:

Hi Flavio,

I have a patch for this, but it sits on top of the fix for ZOOKEEPER-932. We have 932 applied to our ZK code since we need it. Until ZOOKEEPER-932 is reviewed and committed, I will have to keep backporting patches (and do double testing). I will port my changes to trunk if someone needs a fix for this bug. Since this is not a blocker, I am going to hold off on the patch until 932 is reviewed. That will reduce my testing and porting overhead. Does that sound OK?

The patch I have is good only for FLE.

About maintenance: some time back we talked about maintaining only the TCP version of FLE (FLE+QCM). There was never any real pressure to eliminate the others, and in fact some users were still using LE at the time. I'm all for maintaining only FLE, but we need to hear the opinions of others. More thoughts?

The documentation says: "The implementations of leader election 1 and 2 are currently not supported, and we have the intention of deprecating them in the near future. Implementations 0 and 3 are currently supported, and we plan to keep supporting them in the near future. To avoid having to support multiple versions of leader election unnecessarily, we may eventually consider deprecating algorithm 0 as well, but we will plan according to the needs of the community."

Is there a significant advantage to using LE 0 over LE 3?

> new peer goes in LEADING state even if ensemble is online
> ---------------------------------------------------------
>                 Key: ZOOKEEPER-975
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-975
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.2
>            Reporter: Vishal K
>             Fix For: 3.4.0
>         Attachments: ZOOKEEPER-975.patch
> Scenario:
> 1. 2 of the 3 ZK nodes are online
> 2. Third node is attempting to join
> 3. Third node unnecessarily goes into the "LEADING" state
> 4. The third node then goes back to LOOKING (no majority of followers) and finally goes to the FOLLOWING state.
> While going through the logs I noticed that a peer C that is trying to
> join an already formed cluster goes into the LEADING state. This is because
> the QuorumCnxManager of A and B sends the entire history of notification
> messages to C. C receives the notification messages that were
> exchanged between A and B when they were forming the cluster.
> In FastLeaderElection.lookForLeader(), due to the following piece of
> code, C quits lookForLeader assuming that it is supposed to lead.
>     //If have received from all nodes, then terminate
>     if ((self.getVotingView().size() == recvset.size()) &&
>             (self.getQuorumVerifier().getWeight(proposedLeader) != 0)){
>         self.setPeerState((proposedLeader == self.getId()) ?
>                 ServerState.LEADING: learningState());
>         leaveInstance();
>         return new Vote(proposedLeader, proposedZxid);
>
>     } else if (termPredicate(recvset,
> This can cause:
> 1. C to unnecessarily go into the LEADING state, wait for tickTime * initLimit, and then restart FLE.
> 2. C to wait for 200 ms (finalizeWait) and then make a decision based on
> whatever notifications it has received. C could potentially
> decide to follow an old leader, fail to connect to that leader, and
> then restart FLE. See code below.
>     if (termPredicate(recvset,
>             new Vote(proposedLeader, proposedZxid,
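
To make the first failure mode in the quoted description concrete, here is a minimal, self-contained sketch of how replayed notifications can satisfy the "received from all nodes" check. It is not ZooKeeper code: the class StaleNotificationSketch, its Vote and Notification types, the supersedes() ordering, and the server ids and zxid values are all hypothetical simplifications, and the quorum-weight check from the quoted snippet is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the check described above; not ZooKeeper's classes.
public class StaleNotificationSketch {
    enum State { LOOKING, FOLLOWING, LEADING }

    static final class Vote {
        final long leader; // proposed leader's server id
        final long zxid;   // last zxid seen by the proposed leader
        Vote(long leader, long zxid) { this.leader = leader; this.zxid = zxid; }
    }

    static final class Notification {
        final long sender; // server id of the peer that sent the vote
        final Vote vote;
        Notification(long sender, Vote vote) { this.sender = sender; this.vote = vote; }
    }

    // Assumed ordering: higher zxid wins, ties are broken by higher server id.
    static boolean supersedes(Vote candidate, Vote current) {
        return candidate.zxid > current.zxid
                || (candidate.zxid == current.zxid && candidate.leader > current.leader);
    }

    // Processes notifications in arrival order and stops as soon as one vote
    // per voting member has been recorded, whether or not those votes are current.
    static State decide(long selfId, long selfZxid, int votingViewSize,
                        List<Notification> inbox) {
        Vote proposal = new Vote(selfId, selfZxid);
        Map<Long, Vote> recvset = new HashMap<>();
        for (Notification n : inbox) {
            if (supersedes(n.vote, proposal)) {
                proposal = n.vote;
            }
            recvset.put(n.sender, n.vote);
            if (recvset.size() == votingViewSize) {
                return proposal.leader == selfId ? State.LEADING : State.FOLLOWING;
            }
        }
        return State.LOOKING; // not enough votes yet
    }

    public static void main(String[] args) {
        // Peer C has id 3; A (id 1) and B (id 2) already form the ensemble.
        List<Notification> inbox = new ArrayList<>();
        inbox.add(new Notification(3, new Vote(3, 5))); // C's own vote
        inbox.add(new Notification(1, new Vote(1, 5))); // replayed: A's old vote for itself
        inbox.add(new Notification(2, new Vote(2, 5))); // replayed: B's old vote for itself
        System.out.println(decide(3, 5, 3, inbox));
    }
}
```

With these assumed values the sketch reports LEADING for C: the two replayed votes plus C's own vote fill recvset to the size of the voting view, and none of the old votes outranks C's own proposal, so the check terminates in C's favour even though A and B settled on a leader long ago.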
