Review Request 15779: Flume-2245 HDFS Sink BucketWriter failing to close after datanode issues

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15779/
-----------------------------------------------------------

Review request for Flume.
Repository: flume-git
Description
-------

https://issues.apache.org/jira/browse/FLUME-2245

Originally the flush() seemed superfluous; however, without it one of the unit tests breaks.

By moving on regardless of whether the flush succeeds, we allow the backing stream to actually get closed and reopened. While the real problem is that the HDFS stream does not recover, this workaround seems necessary, as otherwise appends will continue to fail until a restart.

Similarly, HDFSDataStream and HDFSCompressedDataStream are now closed regardless of whether serialization/flushing succeeds. The exception should still be propagated and cause a rollback, so no data loss occurs.
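
The close-regardless behaviour described above can be sketched roughly as follows. This is only an illustration of the pattern, not the code in the diff: BackingStream, FailingFlushStream, CloseRegardlessSketch and flushAndClose are hypothetical names made up for this example, and the use of Throwable.addSuppressed assumes Java 7 or later.

```java
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;

/** Hypothetical stand-in for an HDFS-backed output stream; not a Flume class. */
interface BackingStream extends Flushable, Closeable {}

/** Hypothetical test double whose flush always fails, like a stream stuck after datanode issues. */
final class FailingFlushStream implements BackingStream {
  boolean closed = false;

  @Override
  public void flush() throws IOException {
    throw new IOException("simulated datanode failure");
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class CloseRegardlessSketch {
  private CloseRegardlessSketch() {}

  /**
   * Attempts a flush, then always attempts a close. The first failure is
   * rethrown after the close attempt so the caller can roll back.
   */
  static void flushAndClose(BackingStream stream) throws IOException {
    IOException failure = null;
    try {
      stream.flush();
    } catch (IOException e) {
      failure = e; // remember the failure, but keep going so the stream gets closed
    }
    try {
      stream.close();
    } catch (IOException e) {
      if (failure == null) {
        failure = e;
      } else {
        failure.addSuppressed(e); // keep the flush failure as the primary cause
      }
    }
    if (failure != null) {
      throw failure; // propagate so the transaction is rolled back
    }
  }
}
```

With a stream like FailingFlushStream, flushAndClose still closes the stream and then throws the flush IOException, so a caller could reopen a fresh stream on the next append instead of failing until a restart.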
Diffs
-----

  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 200d457
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee

Diff: https://reviews.apache.org/r/15779/diff/
Testing
-------

Existing unit tests pass.

I'm still trying to figure out a way to recreate the issue, as it is hard to determine the exact cause.

Thanks,

Juhani Connolly