Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Flume, mail # dev - Review Request: FLUME-1762. File Channel should recover automatically if the checkpoint is incomplete or bad by deleting the contents of the checkpoint directory


+
Hari Shreedharan 2012-12-07, 05:29
+
Hari Shreedharan 2012-12-07, 06:16
Copy link to this message
-
Re: Review Request: FLUME-1762. File Channel should recover automatically if the checkpoint is incomplete or bad by deleting the contents of the checkpoint directory
Brock Noland 2012-12-07, 14:59

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8396/#review14147
-----------------------------------------------------------
Hari, I think this is a great patch!!  Nice work! Patch looks great, I have a few typical small comments below. Additionally, it looks like we are missing a couple test cases where we'd do full replay. It's certainly possible I am just missing the test, so excuse me if I am:

Bad Version of Checkpoint itself
Bad Version of Checkpoint meta
Differing write order ids between meta and checkpoint
Checkpoint marker incomplete
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/BadCheckpointException.java
<https://reviews.apache.org/r/8396/#comment30239>

    Nit: Javadoc comment without any text

flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/BadCheckpointException.java
<https://reviews.apache.org/r/8396/#comment30240>

    When we log this exception we print a very similar message. Additionally, this constructor is used when this statement is probably not true (ie the version is bad).
    
    What do you think about removing this constructor and then adding an appropriate error message each time?

flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
<https://reviews.apache.org/r/8396/#comment30241>

    I think we can delete the v1 stuff but we should probably do that in a different JIRA.

flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Serialization.java
<https://reviews.apache.org/r/8396/#comment30238>

    How about logging what files we are deleting just in case a file goes missing at some point?
- Brock Noland
On Dec. 7, 2012, 6:16 a.m., Hari Shreedharan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8396/
> -----------------------------------------------------------
>
> (Updated Dec. 7, 2012, 6:16 a.m.)
>
>
> Review request for Flume.
>
>
> Description
> -------
>
> Added code to throw a BadCheckpointException, if we can recover from the situation by deleting all files in the checkpoint directory. In the log class, during startup if BadCheckpointException is caught, all files are deleted and replay is retried.
>
>
> This addresses bug FLUME-1762.
>     https://issues.apache.org/jira/browse/FLUME-1762
>
>
> Diffs
> -----
>
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/BadCheckpointException.java PRE-CREATION
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFactory.java 6c07152
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFile.java 5eaf8c2
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV2.java 8bbc081
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV3.java c24f89f
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java 36553c5
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java 6d1cf51
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Serialization.java ef8cf72
>   flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestEventQueueBackingStoreFactory.java b1a55be
>   flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelRestart.java 3f90805
>   flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java 0173390
>
> Diff: https://reviews.apache.org/r/8396/diff/
>
>
> Testing
> -------
>
> Added unit tests. Modified some existing unit tests to test for this change.
+
Hari Shreedharan 2012-12-07, 17:58
+
Brock Noland 2012-12-07, 17:11
+
Hari Shreedharan 2012-12-07, 17:58
+
Brock Noland 2012-12-07, 18:20
+
Hari Shreedharan 2012-12-07, 23:52
+
Brock Noland 2012-12-10, 16:10
+
Hari Shreedharan 2012-12-10, 17:40
+
Brock Noland 2012-12-10, 17:53