Zookeeper >> mail # dev >> send UPTODATE to follower until a quorum of servers synced with leader


Re: send UPTODATE to follower until a quorum of servers synced with leader
sorry, i'm behind on my email. you are correct :)

ben

On Mon, Mar 28, 2011 at 1:03 PM, Fournier, Camille F. [Tech] <
[EMAIL PROTECTED]> wrote:

> I take that back. Right after the UPTODATE send in LearnerHandler, we wait
> for the final ACK from that follower and call processAck on that packet. We
> need that ack set to reach a quorum before we start up the Leader
> ZooKeeperServer. Until that is started, we won’t process REVALIDATE requests
> and we won’t accept connections ourselves (so clients can’t connect to us to
> revalidate their session). So I think we are ok.
>
>
>
> C
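
The gating described above can be modeled roughly as follows. This is only a toy sketch in Python, not the ZooKeeper Java code: `LeaderModel`, `process_ack`, and `revalidate` are hypothetical names, and the model assumes the leader counts its own id in the ack set. It shows the one property that matters here: nothing is served until a majority of the ensemble has sent its final ACK.

```python
class LeaderModel:
    """Toy model of a leader that serves only after a quorum of final ACKs.

    Hypothetical names throughout; assumes the leader acks itself.
    """

    def __init__(self, ensemble_size: int, leader_id: int) -> None:
        self.ensemble_size = ensemble_size
        self.acks = {leader_id}   # assumption: the leader counts itself
        self.serving = False      # stands in for "Leader ZooKeeperServer started"

    def process_ack(self, server_id: int) -> bool:
        """Record the final ACK from a follower; start serving on a majority."""
        self.acks.add(server_id)
        if len(self.acks) > self.ensemble_size // 2:
            self.serving = True
        return self.serving

    def revalidate(self, session_id: int) -> bool:
        """Session revalidation is refused until the leader is serving."""
        if not self.serving:
            raise RuntimeError("leader not serving yet: no quorum of ACKs")
        return True
```

In a five-server model, the first follower ACK leaves the leader at two of five, so `revalidate` still raises; the second follower ACK reaches three of five and serving starts.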
>
>
>
> *From:* Fournier, Camille F. [Tech]
> *Sent:* Monday, March 28, 2011 3:34 PM
> *To:* '[EMAIL PROTECTED]'
> *Subject:* RE: send UPTODATE to follower until a quorum of servers synced
> with leader
>
>
>
> Looking at the code, it looks like we don’t need a synced quorum to accept
> a new client session, just a quorum in the process of syncing, so I don’t
> think the session handling will solve this. I suppose it’s a warning that
> correctness for n=3 doesn’t extend to all possible cluster sizes N.
>
> Definitely worth opening a JIRA.
>
>
>
> C
>
>
>
> *From:* Flavio Junqueira [mailto:[EMAIL PROTECTED]]
> *Sent:* Monday, March 28, 2011 11:49 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: send UPTODATE to follower until a quorum of servers synced
> with leader
>
>
>
> Hi Jiangwen, Good catch. I followed the code and it does sound like this
> scenario can happen, ignoring how sessions are handled. I checked that a
> follower takes a snapshot and starts a zookeeper server right after
> receiving an UPTODATE message. I'm not clear, though, if it is possible for
> a client to revalidate a session while the leader hasn't started. I was
> discussing with Ben offline and it sounds like we do not necessarily wait
> for a leader to come up to revalidate sessions. I'm not so familiar with the
> session handling part of the code, so I'll let Ben or someone else add to
> this discussion.
>
>
>
> In any case, you might want to open a JIRA to track this discussion so that
> we don't lose important comments. I also wanted to point out that we have been
> observing a few corner cases like the one you raised, and we have been
> designing changes to the implementation that take care of such problems. If
> I'm not mistaken, the scenario you point out wouldn't happen under our
> changes because followers would wait for a commit message (wait for a quorum
> to ack) before starting a server, as you point out. The latest notes on the
> design are under Zab1.0 in the ZooKeeper wiki.
>
>
>
> Thanks,
>
> -Flavio
>
>
>
>
>
> On Mar 28, 2011, at 10:24 AM, jiangwen w wrote:
>
>
>
> 1. Current process
> When the leader fails, a new leader is elected and the followers sync with
> the new leader.
> After a follower has synced, the leader sends UPTODATE to that follower.
>
> 2. A corner case
> There is a corner case in which things go wrong.
> Suppose message M exists only on the leader. After a follower syncs with
> the leader, a client connected to that follower will see M.
> But M then exists on only two servers, not on a quorum of servers. If the
> new leader and that follower both fail, message M is lost, even though a
> client has already seen it.
>
> 3. One solution
> So I think UPTODATE should be sent to a follower only once a quorum of
> servers has synced with the leader.
>
> Sincerely
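
The arithmetic behind this corner case can be checked with a small sketch. It is a Python model, not ZooKeeper code: the function names are hypothetical, and it assumes the leader counts itself among the servers that hold M. It compares the number of servers holding M with the majority size for a given ensemble.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def copies_of_m(synced_followers: int) -> int:
    """Servers holding M: the leader plus every follower that has synced."""
    return 1 + synced_followers


def safe_to_send_uptodate(synced_followers: int, ensemble_size: int) -> bool:
    """Proposed rule: send UPTODATE only once M is on a quorum of servers."""
    return copies_of_m(synced_followers) >= quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 5):
        # One follower has synced with the new leader.
        print(n, safe_to_send_uptodate(1, n))
```

With one synced follower, the check passes for an ensemble of three (two copies are a majority) but fails for an ensemble of five (two copies, majority is three). That is why the scenario does not show up with n=3 and does with larger ensembles.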