RE: Questions on Zab phases
Alexander Shraer 2011-12-08, 21:32
I'm not an author of the paper you mention, but I might be able to answer.
The protocol implemented in the code is described here: https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0
In the ZAB DSN paper, any follower can start running leader election. In the discovery phase
the candidate leader learns the most up-to-date history from one of its followers and then uses that history in the synchronization phase.
To reduce contention among multiple candidate leaders, and to eliminate the need for followers to send their histories to the candidate leader in the discovery phase, the implementation has a preliminary phase, FastLeaderElection (this is an optimization). A follower starts running the discovery phase only if a quorum supports its candidacy in the preliminary phase. In the preliminary phase, a follower votes for another follower only if that other follower has a higher currentEpoch, called f.a in the ZAB paper (or a higher zxid if the currentEpoch is the same). So the candidate leader running discovery should already have the most up-to-date history and doesn't have to copy it from others. In the discovery phase the candidate leader just makes sure that it does indeed have the most up-to-date history, and otherwise restarts the entire process.
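To make the voting rule above concrete, here is a minimal sketch of the comparison, not the actual FastLeaderElection code. The class and method names are made up for illustration, and it covers only the two criteria mentioned above (currentEpoch first, then zxid); how an exact tie is broken is left out.

```java
// Hypothetical sketch: VoteOrder and supersedes are illustrative names,
// not identifiers from the ZooKeeper source.
final class VoteOrder {
    private VoteOrder() {}

    /**
     * Returns true if a peer advertising (peerEpoch, peerZxid) is strictly
     * more up to date than the candidate currently backed, which advertises
     * (curEpoch, curZxid). A higher currentEpoch wins outright; the zxid is
     * only consulted when the epochs are equal.
     */
    static boolean supersedes(long peerEpoch, long peerZxid,
                              long curEpoch, long curZxid) {
        if (peerEpoch != curEpoch) {
            return peerEpoch > curEpoch;
        }
        return peerZxid > curZxid;
    }
}
```

A follower that applies a rule like this only ever moves its vote towards a more up-to-date peer, which is why the candidate that collects a quorum should already hold the most up-to-date history.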
The discovery phase is done in LearnerHandler.java (look for FOLLOWERINFO to see where it starts). The check I mentioned above is in Leader.java, waitForEpochAck().
The commit message for NEWLEADER is the UPTODATE message.
BTW, in the paper the chosen history is included in the NEWLEADER message. In practice the follower is sent only what it is missing or a snapshot if it's too far behind. The NEWLEADER message just completes the transfer.
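As a rough illustration of the "missing part or snapshot" choice, the decision could be sketched as below. This is not the code from LearnerHandler.java: the names are hypothetical, and the test for "too far behind" (the leader no longer retains the proposals the follower lacks) is my assumption for the sake of the example.

```java
// Hypothetical sketch: SyncPlan, Mode and choose are illustrative names,
// and the "too far behind" test is an assumption, not ZooKeeper's logic.
final class SyncPlan {
    private SyncPlan() {}

    enum Mode { MISSING_PROPOSALS, SNAPSHOT }

    /**
     * Decides what to send to a follower before NEWLEADER.
     *
     * @param followerLastZxid   last zxid the follower reports having
     * @param oldestRetainedZxid oldest zxid the leader can still replay
     *                           (assumed bounded window of recent proposals)
     */
    static Mode choose(long followerLastZxid, long oldestRetainedZxid) {
        // If the follower's history reaches into the retained window, the
        // leader can replay just the proposals the follower is missing.
        // Otherwise the gap cannot be replayed and a snapshot is needed.
        return followerLastZxid >= oldestRetainedZxid
                ? Mode.MISSING_PROPOSALS
                : Mode.SNAPSHOT;
    }
}
```

Either way, the NEWLEADER message is sent only after this transfer, which is the sense in which it completes it. The sketch ignores other cases, such as a follower whose history is ahead of the leader's.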
Hope this helps.
> -----Original Message-----
> From: de Souza Medeiros Andre [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, December 08, 2011 12:39 PM
> To: [EMAIL PROTECTED]
> Subject: Questions on Zab phases
> Dear Zookeeper developers,
> I am a student writing a report on the internals of Zookeeper. I have
> been reading the papers as well as the source code of Zookeeper and
> I have some basic questions to ask.
> First of all, the paper:
> "Junqueira, F.P.; Reed, B.C.; Serafini, M.:
> Zab: High-performance broadcast for primary-backup systems.
> IEEE/IFIP 41st International Conference on
> Dependable Systems & Networks (DSN), 2011."
> discusses Zab as first doing a leader election and followed
> by three phases:
> 1. Discovery
> 2. Synchronization
> 3. Broadcast
> I have also been reading the Zookeeper source code, and
> could locate the following parts:
> - Fast Leader Election (by default)
> - Synchronization
> - Broadcast
> It seems like Fast Leader Election is substituting both
> Leader Election and Discovery phase of the protocol in
> the paper. Is my understanding correct, and what are the
> guarantees assumed by the synchronization phase on earlier
> phases of the Zab pipeline?
> On a side note, I was also quickly looking at the
> Synchronization phase in the code and could not locate
> the Commit message for new leader described in the paper.
> Was this an oversight on my part?
> Thank you,
> Andre Medeiros