Zookeeper >> mail # user >> what would happen with this case ? (ZAB protocol question)


Yang 2011-07-19, 21:44
Yang 2011-07-20, 07:28
Alexander Shraer 2011-07-21, 18:04
Alexander Shraer 2011-07-21, 20:11
Ted Dunning 2011-07-21, 20:24
RE: what would happen with this case ? (ZAB protocol question)
Hi Ted,

I don't see a problem in your scenario. The problem is in a different scenario, which I described in the JIRA: there, C has seen more proposals than B, but B has seen more commits than C. When leader election happens (assuming the servers don't restart beforehand), B will be elected leader rather than C. That is a problem because C's suffix of transactions, which were acked by both A and C, will be truncated.

Alex
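
The scenario in the reply above can be sketched in a few lines of Python. This is only an illustrative model, not ZooKeeper's election code: the `ServerState` class, its field names, the `elect` helper, and the concrete zxid values are all assumptions chosen to match the description (B has more commits, C has more acked proposals).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerState:
    """Hypothetical per-server view; the field names are illustrative only."""
    name: str
    last_logged: int     # highest zxid this server has acked as a proposal
    last_committed: int  # highest zxid for which it has received a commit


def elect(voters, key):
    """Pick the voter with the highest key; ties go to the higher name."""
    return max(voters, key=lambda s: (key(s), s.name))


# Assumed history: leader A (now gone) proposed zxids 1..5.
# B acked 1..3 and received the commits for all of them.
# C acked 1..5, so 4 and 5 were acked by a quorum (A and C),
# but C has so far received commits only for 1..2.
b = ServerState("B", last_logged=3, last_committed=3)
c = ServerState("C", last_logged=5, last_committed=2)

by_commit = elect([b, c], key=lambda s: s.last_committed)
by_log = elect([b, c], key=lambda s: s.last_logged)

print("vote on last committed zxid ->", by_commit.name)
print("vote on last logged zxid    ->", by_log.name)

# If the commit-based winner leads, C must drop everything past its log.
lost = list(range(by_commit.last_logged + 1, c.last_logged + 1))
print("zxids truncated from C:", lost)
```

In this model, voting on the last committed zxid picks B, and the quorum-acked suffix held by C is what gets truncated; voting on the last logged zxid would pick C instead.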

> -----Original Message-----
> From: Ted Dunning [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 21, 2011 1:25 PM
> To: [EMAIL PROTECTED]
> Cc: Yang
> Subject: Re: what would happen with this case ? (ZAB protocol question)
>
> Alex,
>
> Are you sure that this is a bug?
>
> Take the case of three servers A, B and C with A being leader.
>
> If transactions 1, 2 and 3 are committed, then a majority of the nodes,
> including at least A, must have seen these transactions.  Moreover,
> transactions cannot be committed on a node unless all previous transactions
> have been seen on that node as well.  Thus, by symmetry, we can consider
> cases where B alone committed these transactions or where B and C committed
> them.  Only the first case is problematic.
>
> Now, assume further that transaction 4 has arrived at B and been forwarded
> to A, but neither B nor C has committed it.
>
> The situation now is that in this first epoch, A has seen 1-4, B has seen
> 1-3 and C has seen nothing.  At least two nodes know the current epoch
> because we obviously have a quorum and we know that B knows the current
> epoch because it has seen transactions from this epoch.  Thus the collection
> of machines that know the current epoch can be A+B or A+B+C.
>
> If all three nodes now die simultaneously and B and C come back up, the
> question is what will happen.  We know that the two nodes will agree on the
> epoch because at least B has the last epoch.  Node B will be elected leader
> because it has seen later transactions than C.  C will now get the
> transactions and we have a quorum in a new epoch.
>
> If A returns at this point, it will know about transactions 1, 2, 3 and 4.
> Further, it will know that 1, 2, and 3 have been committed in the first
> epoch and that 4 was proposed, but never committed.  As it joins, it will
> find that a new epoch has started and will recognize B as master.  B will
> tell it to truncate the log by deleting 4, but 4 was never committed anyway.
>
> Where is the problem?
>
> On Thu, Jul 21, 2011 at 1:11 PM, Alexander Shraer <shralex@yahoo-inc.com> wrote:
>
> > The problem is in leader election: if the server doesn't reboot before
> > running leader election (the usual case), then only the transactions for
> > which it received a commit count toward its vote, so it might not be
> > elected leader even if it has seen more transactions than the others.
> > This may lead to transactions being dropped.
> >
> > I opened a JIRA for this.
> >
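
For comparison, the three-server walk-through quoted above (A has seen 1-4, B has seen 1-3, C has seen nothing) can be modelled the same way. Again this is a hedged sketch rather than ZooKeeper code: the `logs` dictionary and the helpers `quorum_committed`, `last_zxid`, and `sync_to_leader` are hypothetical names introduced here, and "longest log wins" stands in for the real election.

```python
def quorum_committed(logs, zxid, ensemble_size):
    """A zxid can commit once a strict majority of servers has logged it."""
    acks = sum(1 for log in logs.values() if zxid in log)
    return acks > ensemble_size // 2


def last_zxid(log):
    """Highest zxid in a log, or 0 for an empty log."""
    return log[-1] if log else 0


def sync_to_leader(logs, leader, follower):
    """Make the follower's log match the leader's; return what it dropped."""
    leader_log = logs[leader]
    dropped = [z for z in logs[follower] if z not in leader_log]
    logs[follower] = list(leader_log)
    return dropped


# Assumed state at the end of the first epoch, as in the quoted scenario.
logs = {"A": [1, 2, 3, 4], "B": [1, 2, 3], "C": []}

committed = [z for z in (1, 2, 3, 4) if quorum_committed(logs, z, 3)]
print("committed in first epoch:", committed)

# All three crash; only B and C restart. The most up-to-date survivor leads.
survivors = ["B", "C"]
leader = max(survivors, key=lambda name: (last_zxid(logs[name]), name))
print("new leader:", leader)

sync_to_leader(logs, leader, "C")          # C catches up from the leader
dropped_by_a = sync_to_leader(logs, leader, "A")  # A rejoins later
print("A truncates:", dropped_by_a)
```

Under these assumptions the only transaction A drops is 4, which never reached a quorum, so no committed transaction is lost. The difference from the problematic case is that here the elected server also holds every quorum-acked proposal.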
Ted Dunning 2011-07-21, 22:09