Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume, mail # dev - Re: Review Request 11583: HDFS Sink should check if file is closed and retry if it is not.


Copy link to this message
-
Re: Review Request 11583: HDFS Sink should check if file is closed and retry if it is not.
Hari Shreedharan 2013-07-10, 00:24

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11583/#review22940
-----------------------------------------------------------
This looks good Ted. Only minor feedback - and one request - to change the closeRetries to closeTries, which defaults to 1 (and if it is set to <1, we set it back to 1 anyway). Also could you please update the user guide?
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46725>

    Is this used?

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46718>

    Lets make this one private - are you using this one from outside this class?

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46719>

    Can you change this one to "hdfs.closeTries" to make it similar to the others? Also, instead of number of close retries, let's make it number of close tries - so if this is 3 - total number of times close is called is 3.

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46720>

    The default in HDFS Sink for callTimeout is 10000 - I think we should use the same default here too.

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46723>

    Unnecessary newline

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46721>

    Hmm, if number of close retries is set to zero, then the timeBetweenCloseRetries should have no effect right? So we should "unset" the timeBetweenCloseRetries if numberOfCloseRetries < 0 - maybe set it to INT_MAX? But this really does not matter, since the closeHDFSOutputStream checks if we should retry.

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46722>

    Why so many newlines?

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46726>

    If we change this to number of close tries, then this should also take care of that, so we call close() method only that many times.
    
    To keep this code the same we should perhaps set numberOfCloseRetries to the value from the config - 1, but also check that the file is not closed before logging the error (to handle the case where the closeTries is set to 1)

flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/11583/#comment46724>

    unnecessary newlines
- Hari Shreedharan
On July 2, 2013, 1:18 p.m., Ted Malaska wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11583/
> -----------------------------------------------------------
>
> (Updated July 2, 2013, 1:18 p.m.)
>
>
> Review request for Flume.
>
>
> Bugs: 2007
>     https://issues.apache.org/jira/browse/2007
>
>
> Repository: flume-git
>
>
> Description
> -------
>
> We can use the new API added in HDFS-4525. We will need to use reflection though, so we can run against a version of HDFS which does not have this API.
>
>
> Diffs
> -----
>
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java bc3b383
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 2c2be6a
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java b8214be