
Re: Problems about Zab protocol
Daidong,

There are several key differences between distributed transactions and
the replication problem we solve in ZooKeeper. If you are interested in
understanding them, you might start by having a look at the Paxos Commit
work of Gray and Lamport. They have a technical report available online;
just use your favorite search engine.


On Apr 23, 2011, at 6:55 AM, daidong wrote:

> Hi, Alex
> Thanks for your reply, and for Flavio's.
> I think I finally get the idea. :)
> Would it be appropriate to see ZAB as a 3PC without the READY/WAIT
> state? All the participants will reply VOTE_COMMIT (they do not
> abort...).
> I will read the source code and hope I can do some work with ZAB.
> Thanks a lot for all the replies.
> --
> daidong
> On Friday, April 22, 2011 at 3:54 AM, Alexander Shraer [via
> zookeeper-user] wrote:
>> Hi Daidong,
>> In addition to Flavio's response, I'll try to address some of your
>> specific questions.
>>> In my opinion, an atomic broadcast protocol must guarantee that all
>>> the non-faulty servers have the same state eventually. So in the 2PC
>>> protocol, the coordinator must block until "all" the servers reply
>>> "ok".
>> Designed this way, the protocol wouldn't be able to tolerate any
>> failures - the leader could block waiting for a response from a
>> server that had crashed. The idea is to receive enough "ok" messages
>> to guarantee that even if a minority of servers crash, the
>> information is still not lost. That's why the leader waits for a
>> majority of acks. Messages are still sent to all followers, so they
>> will eventually get them (or, if they disconnect, they will later
>> reconnect and sync with the leader automatically).
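
To make the majority-ack rule in Alex's paragraph above concrete, here
is a minimal sketch in Python. It is not ZooKeeper's code: the Proposal
class, the majority() helper, and the server names are hypothetical, and
the sketch assumes the leader's own ack counts toward the quorum.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


class Proposal:
    """One broadcast message awaiting acks (hypothetical illustration)."""

    def __init__(self, zxid: int, ensemble_size: int):
        self.zxid = zxid                  # assumed name for the message id
        self.ensemble_size = ensemble_size
        self.acks = set()                 # ids of servers that replied "ok"

    @property
    def committed(self) -> bool:
        # Commit once a majority has acked; never wait for "all" servers.
        return len(self.acks) >= majority(self.ensemble_size)

    def record_ack(self, server_id: str) -> bool:
        # A set ignores duplicate acks from the same server.
        self.acks.add(server_id)
        return self.committed


proposal = Proposal(zxid=1, ensemble_size=5)
proposal.record_ack("leader")   # assumption: the leader acks its own proposal
proposal.record_ack("f1")
print(proposal.committed)       # False: only 2 of 5 have acked
proposal.record_ack("f2")
print(proposal.committed)       # True: 3 of 5 is a majority
```

With five servers the proposal commits after three acks, so two servers
can be slow or crashed without blocking the leader, and any later
majority still contains at least one server that holds the message.
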
>> Regarding your second question - formally, sequential consistency
>> guarantees that the operations of each client take effect in the
>> order they were submitted by that client - so a client's read is
>> guaranteed to see its own last completed write.
>> In the example you mention, the client first executes a create() and
>> then getChildren(). If clients C1 and C2 both submit a create()
>> concurrently, one of these requests will reach the leader and be
>> scheduled by the leader before the other one; suppose it is the
>> create() request of C1.
>> Then, when C2 is notified about the completion of its own create(),
>> FIFO ensures that it also finds out about any operation that
>> completed before that create() (these messages were sent by the
>> leader earlier). So when C2 finally runs getChildren(), its local
>> state will already have every operation that was scheduled by the
>> leader before its own create() completed.
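
The FIFO argument in Alex's paragraph above can be sketched as a toy
model in Python. Everything here is hypothetical (the Leader and
Follower classes, the deliver_until() method, the node names), and the
sketch deliberately leaves out quorums and commit messages: it only
shows that a follower which applies the leader's messages in the order
they were sent must have applied C1's create() by the time C2's
create() completes.

```python
from collections import deque


class Follower:
    """Applies the leader's messages strictly in arrival (FIFO) order."""

    def __init__(self):
        self.inbox = deque()    # FIFO channel from the leader
        self.children = []      # local view of the parent node's children

    def deliver_until(self, zxid: int) -> None:
        # Apply every queued message up to and including `zxid`.
        while self.inbox and self.inbox[0][0] <= zxid:
            _, (kind, name) = self.inbox.popleft()
            if kind == "create":
                self.children.append(name)

    def get_children(self) -> list:
        return list(self.children)


class Leader:
    """Assigns a total order and sends each message on every channel."""

    def __init__(self):
        self.next_zxid = 1
        self.followers = []

    def connect(self, follower: Follower) -> None:
        self.followers.append(follower)

    def broadcast(self, op: tuple) -> int:
        zxid = self.next_zxid
        self.next_zxid += 1
        for follower in self.followers:
            follower.inbox.append((zxid, op))   # same order on every channel
        return zxid


leader = Leader()
c2_server = Follower()          # the server that client C2 is connected to
leader.connect(c2_server)

z1 = leader.broadcast(("create", "node-from-c1"))   # scheduled first
z2 = leader.broadcast(("create", "node-from-c2"))   # scheduled second

c2_server.deliver_until(z2)     # C2 learns that its own create() completed
print(c2_server.get_children()) # ['node-from-c1', 'node-from-c2']
```

Because the channel is a queue, there is no way for the follower to
reach the message for C2's create() without first applying the one for
C1's create().
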
>> In general, ZAB implements state-machine replication by executing
>> consensus on each operation. To understand the general idea, I
>> recommend reading Lamport's "Paxos Made Simple" paper I sent earlier -
>> it has a constructive explanation of this (although the algorithm is
>> somewhat different from ZAB).
>> Alex
>>> -----Original Message-----
>>> From: daidong [mailto:]
>>> Sent: Wednesday, April 20, 2011 11:31 PM
>>> To: [hidden email]
>>> Subject: Re: RE: Problems about Zab protocol
>>> Hi, Alex
>>> Thanks for your reply. :)
>>> I knew ZAB has two modes, but the things I do not quite understand
>>> concern the broadcast mode. In the ZAB paper, the authors say ZAB is
>>> a simple version of the two-phase commit protocol because there are
>>> no abort actions in followers. I do not quite understand this.
>>> In my opinion, an atomic broadcast protocol must guarantee that all
>>> the non-faulty servers have the same state eventually. So in the 2PC
>>> protocol, the coordinator must block until "all" the servers reply
>>> "ok". If there is not any abort either, consider the situation where
>>> we have a very slow follower F that processes messages more slowly
>>> than the other followers.
>>> According to TCP and FIFO channels, we can say all the messages will be

