Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Zookeeper >> mail # dev >> Review Request: ZOOKEEPER-1413: Use on-disk transaction log for learner sync up


Copy link to this message
-
Re: Review Request: ZOOKEEPER-1413: Use on-disk transaction log for learner sync up


> On May 24, 2013, 12:19 a.m., fpj wrote:
> > /src/java/main/org/apache/zookeeper/server/TxnLogProposalIterator.java, line 67
> > <https://reviews.apache.org/r/11231/diff/1/?file=293955#file293955line67>
> >
> >     Ideally we propagate the exception up, but it is incompatible with the signature of the method unless we throw NoSuchElementException. What if we return null in the case of an exception?
>
> Thawan Kooburat wrote:
>     IOException can be thrown from  
>    
>     hasNext = itr.next();
>    
>     which for prefetching the next element.
>    
>     I think we only need to set hasNext to false when we see exception, so the client won't try to call next() again
>
>
> Raul Gutierrez Segales wrote:
>     Record#serialize() can also throw IOException. Are callers of this method expected to deal with a Proposal that isn't properly initialized (i.e.: won't have the QuorumPacket member won't be initialized so might have wrong zxid, type, etc)?
>
> Thawan Kooburat wrote:
>     itr.next() is the only place that actually perform IO-related operation.
>    
>     Record#serialize() can thrown IO exception if the output stream is writing to network or disk. However, in this case, we are writing it to memory because we are using BinaryOutputArchive/ByteArrayOutputStream.  I looked at the code, these implementation cannot throw IOException but they have to inherit the method signature of output stream.
>
> fpj wrote:
>     My concern here is that we simply log the error and keep making progress regularly, that's why I suggested we could try to propagate an exception up, but it is not completely straightforward because of the reasons I stated. If a server acks a txn but later cannot read it, then we might end up in an inconsistent state.

>From synchronization perspective, if the leader encounter a possible hole or corrupted txnlog, it will have to abort by clear the txn queue which is going to be sent to the learner. Then, it can fall back to sending a snapshot.

The txnlog facility need to be able to guarantee there properties but i don't think we have that yet.  I think we only do per-entry checksum and throws IOexception because of disk read, checksum or deserialization failure.  
- Thawan
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11231/#review20941
-----------------------------------------------------------
On May 20, 2013, 5:40 a.m., Thawan Kooburat wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11231/
> -----------------------------------------------------------
>
> (Updated May 20, 2013, 5:40 a.m.)
>
>
> Review request for zookeeper.
>
>
> Description
> -------
>
> ZOOKEEPER-1413: Use on-disk transaction log for learner sync up
>
> - Use txnlog for learner synchronization if learner fall too far behind
> - Refactoring LearnerHandler to deal with different cases of handling learner synchronization  
>
>
> This addresses bug https://issues.apache.org/jira/browse/ZOOKEEPER-1413.
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/ZOOKEEPER-1413
>
>
> Diffs
> -----
>
>   /src/java/main/org/apache/zookeeper/server/TxnLogProposalIterator.java PRE-CREATION
>   /src/java/main/org/apache/zookeeper/server/ZKDatabase.java 1483440
>   /src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java 1483440
>   /src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java 1483440
>   /src/java/main/org/apache/zookeeper/server/persistence/TxnLog.java 1483440
>   /src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 1483440
>   /src/java/test/org/apache/zookeeper/server/quorum/LearnerHandlerTest.java PRE-CREATION
>   /src/java/test/org/apache/zookeeper/test/FollowerResyncConcurrencyTest.java 1483440
>   /src/java/test/org/apache/zookeeper/test/GetProposalFromTxnTest.java PRE-CREATION