Re: Review Request: FLUME-1660 Close "idle" hdfs handles

This is an automatically generated e-mail. To reply, visit:

(Updated Nov. 16, 2012, 2 a.m.)
Review request for Flume and Mike Percy.

Since no one had other suggestions, I addressed the issue mentioned in the last patch by using a flag to check whether a bucket writer was closed for idling. I don't really like it, but the only other way to stay consistent would be a lock shared with BucketWriter (covering both the bucketWriter map get/put section and the bucketWriter idleClose timer event).
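To make the flag approach concrete, here is a minimal sketch of the idea, not the code in the patch. The class and method names (IdleFlagSketch, Writer, writerFor, isClosedForIdle) are hypothetical stand-ins; only the notion of an idleClose event and a "closed for idling" flag comes from the description above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; this is not Flume's HDFSEventSink or BucketWriter.
class IdleFlagSketch {

    /** Stand-in for a bucket writer that remembers why it was closed. */
    static class Writer {
        private boolean open = true;
        private boolean closedForIdle = false;

        synchronized void append(String event) throws IOException {
            if (!open) {
                // A caller holding a stale reference sees an IOException.
                throw new IOException("writer already closed");
            }
            // A real writer would write the event to HDFS here.
        }

        /** In this sketch, the writer's own idle timer would call this. */
        synchronized void idleClose() {
            open = false;
            closedForIdle = true;
        }

        synchronized boolean isClosedForIdle() {
            return closedForIdle;
        }
    }

    private final Map<String, Writer> writers = new HashMap<>();

    /** Sink-side lookup: a writer flagged as closed for idling is replaced. */
    synchronized Writer writerFor(String path) {
        Writer w = writers.get(path);
        if (w == null || w.isClosedForIdle()) {
            w = new Writer();
            writers.put(path, w);
        }
        return w;
    }

    public static void main(String[] args) throws IOException {
        IdleFlagSketch sink = new IdleFlagSketch();
        Writer first = sink.writerFor("/logs/a");
        first.append("event-1");
        first.idleClose();                          // simulate the idle timer firing
        Writer second = sink.writerFor("/logs/a");  // flag forces a fresh writer
        second.append("event-2");
        System.out.println("replaced: " + (first != second));
    }
}
```

The flag avoids sharing a lock between the map and the timer, at the cost that a caller which fetched the writer just before idleClose() can still hit the IOException on append.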

Or we could go back to the original implementation which deals with most of this in HDFSEventSink. It breaks up encapsulation of the rolling functionality into multiple classes, but it feels less forced to me than this solution.

Feedback would be appreciated.

Either way, this works. In the borderline case of sleeping barely longer than the idle timeout, the unit test will often get an IOException because the process() call tries to use a bucketWriter that is in the middle of being decommissioned. This is not a problem, as the next process() call will work fine.

Added lastWrite to BucketWriter to track when it was last written to.

Added a thread to HDFSEventSink which verifies the last update of each active bucketWriter and closes them after the configurable timeout hdfs.closeIdleTimeout has passed.
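The sink-side idle check described above could look roughly like the following sketch. It is an illustration under assumptions, not the patch itself: the class IdleSweepSketch, its methods, the millisecond unit, and the sweep period are all hypothetical. Only the lastWrite timestamp, the configurable timeout, and the "0 disables the feature" behavior come from this review.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; this is not Flume's HDFSEventSink.
class IdleSweepSketch {

    /** Stand-in for a bucket writer that records its last write time. */
    static class Writer {
        volatile long lastWrite;
        volatile boolean open = true;

        Writer(long now) { lastWrite = now; }
        void append(long now) { lastWrite = now; }  // a real writer would also write data
        void close() { open = false; }
    }

    private final Map<String, Writer> active = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;           // assumed unit; 0 disables the check
    private ScheduledExecutorService sweeper;

    IdleSweepSketch(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    Writer open(String path, long now) {
        return active.computeIfAbsent(path, p -> new Writer(now));
    }

    /** Closes and removes every writer idle for at least the timeout. */
    int closeIdle(long now) {
        if (idleTimeoutMillis <= 0) {
            return 0;                               // feature disabled
        }
        int closed = 0;
        Iterator<Map.Entry<String, Writer>> it = active.entrySet().iterator();
        while (it.hasNext()) {
            Writer w = it.next().getValue();
            if (now - w.lastWrite >= idleTimeoutMillis) {
                w.close();
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    /** Starts a background thread that runs the idle check periodically. */
    void start(long periodMillis) {
        if (idleTimeoutMillis <= 0) {
            return;
        }
        sweeper = Executors.newSingleThreadScheduledExecutor();
        sweeper.scheduleAtFixedRate(
                () -> closeIdle(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        if (sweeper != null) {
            sweeper.shutdownNow();
        }
    }
}
```

Passing the clock value into closeIdle keeps the check testable without sleeping. The sketch does nothing about a writer being appended to while it is being closed, which is the race that the borderline unit-test case runs into.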
This addresses bug FLUME-1660.
Diffs (updated)

  flume-ng-doc/sphinx/FlumeUserGuide.rst c1303e0
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 9f2c763
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java e369604
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 6a8072e
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java fee4c8b

Diff: https://reviews.apache.org/r/7659/diff/

Local machine testing was performed and the correct closing of files was confirmed, as well as the correct behavior of the configuration setting, including disabling the feature (by using the default value of 0 for hdfs.closeIdleTimeout).
There is one unrelated test failure whose cause I'm not sure of (if anyone knows what's causing it, please let me know):

Failed tests:   testInOut(org.apache.flume.test.agent.TestFileChannel): Expected FILE_ROLL sink's dir to have only 1 child, but found 0 children. expected:<1> but was:<0>

All other tests pass.

Juhani Connolly