Justin Workman 2012-10-12, 22:51
Mike Percy 2012-10-13, 00:07
Roshan Naik 2012-10-18, 20:13
Re: [jira] [Commented] (FLUME-1350) HDFS file handle not closed properly when date bucketing
Juhani Connolly 2012-10-19, 10:00
My implementation is synchronized on the writer map, and the append and
close operations on the BucketWriter are synchronized. In rare cases a
writer may be closed just before it is about to append, but that is
harmless: it will just back off and get a fresh writer on the next cycle.

Also, if possible, please add comments to the JIRA thread when the mail
is generated from there :)

On 10/19/2012 05:13 AM, Roshan Naik wrote:
> Will need to handle race conditions like.. a thread resumes writing
> immediately after the watcher thread decides to close the file handle. In
> that sense a deterministic close is nicer than a timeout based 'garbage
> collection'
> -roshan
>
>
> On Thu, Oct 18, 2012 at 12:04 PM, Mike Percy (JIRA) <[EMAIL PROTECTED]> wrote:
>
>> [
>> https://issues.apache.org/jira/browse/FLUME-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479255#comment-13479255]
>>
>> Mike Percy commented on FLUME-1350:
>> -----------------------------------
>>
>> Hi Juhani, something like a close-on-idle timeout makes sense. I'd be
>> happy to review it if you want to work on it.
>>
>>> HDFS file handle not closed properly when date bucketing
>>> ---------------------------------------------------------
>>>
>>> Key: FLUME-1350
>>> URL: https://issues.apache.org/jira/browse/FLUME-1350
>>> Project: Flume
>>> Issue Type: Bug
>>> Components: Sinks+Sources
>>> Affects Versions: v1.1.0, v1.2.0
>>> Reporter: Robert Mroczkowski
>>> Attachments: HDFSEventSink.java.patch
>>>
>>>
>>> With configuration:
>>> agent.sinks.hdfs-cafe-access.type = hdfs
>>> agent.sinks.hdfs-cafe-access.hdfs.path = hdfs://nga/nga/apache/access/%y-%m-%d/
>>> agent.sinks.hdfs-cafe-access.hdfs.fileType = DataStream
>>> agent.sinks.hdfs-cafe-access.hdfs.filePrefix = cafe_access
>>> agent.sinks.hdfs-cafe-access.hdfs.rollInterval = 21600
>>> agent.sinks.hdfs-cafe-access.hdfs.rollSize = 10485760
>>> agent.sinks.hdfs-cafe-access.hdfs.rollCount = 0
>>> agent.sinks.hdfs-cafe-access.hdfs.txnEventMax = 1000
>>> agent.sinks.hdfs-cafe-access.hdfs.batchSize = 1000
>>> #agent.sinks.hdfs-cafe-access.hdfs.codeC = snappy
>>> agent.sinks.hdfs-cafe-access.hdfs.hdfs.maxOpenFiles = 5000
>>> agent.sinks.hdfs-cafe-access.channel = memo-1
>>> When new directory is created previous file handle remains opened.
>>> rollInterval setting is used only with files in current date bucket.
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA
>> administrators
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
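
For readers following the thread, the locking scheme described in the reply at the top of this message (a synchronized writer map, synchronized append and close, and a harmless back-off when a writer is closed just before an append) might look roughly like the sketch below. This is not the attached patch or Flume's actual code: the class names IdleCloseSketch, BucketWriter and WriterRegistry, the method names, and the explicit nowMillis parameters are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; these are not Flume's real classes or signatures.
class IdleCloseSketch {

    /** Stand-in for a per-bucket file writer. */
    static class BucketWriter {
        private final List<String> written = new ArrayList<>();
        private boolean closed = false;
        private long lastAppendMillis;

        BucketWriter(long nowMillis) {
            this.lastAppendMillis = nowMillis;
        }

        /** Returns false, writing nothing, if the watcher already closed this writer. */
        synchronized boolean append(String event, long nowMillis) {
            if (closed) {
                return false;
            }
            written.add(event);
            lastAppendMillis = nowMillis;
            return true;
        }

        /** Idleness is checked under the same lock as append, so the two cannot interleave. */
        synchronized boolean closeIfIdle(long nowMillis, long idleMillis) {
            if (closed) {
                return true;
            }
            if (nowMillis - lastAppendMillis < idleMillis) {
                return false;
            }
            closed = true; // a real writer would release its file handle here
            return true;
        }
    }

    /** Holds one writer per bucket path; every map access is synchronized on the map. */
    static class WriterRegistry {
        private final Map<String, BucketWriter> writers = new HashMap<>();
        private final long idleMillis;

        WriterRegistry(long idleMillis) {
            this.idleMillis = idleMillis;
        }

        BucketWriter writerFor(String bucket, long nowMillis) {
            synchronized (writers) {
                BucketWriter w = writers.get(bucket);
                if (w == null) {
                    w = new BucketWriter(nowMillis);
                    writers.put(bucket, w);
                }
                return w;
            }
        }

        /** Returns false when the writer lost the race; the caller retries on its next cycle. */
        boolean append(String bucket, String event, long nowMillis) {
            BucketWriter w = writerFor(bucket, nowMillis);
            if (w.append(event, nowMillis)) {
                return true;
            }
            synchronized (writers) {
                writers.remove(bucket, w); // drop the stale entry only if it is still mapped
            }
            return false;
        }

        /** Called by the watcher thread; closes and forgets writers that have gone idle. */
        int closeIdle(long nowMillis) {
            int closedCount = 0;
            synchronized (writers) {
                Iterator<BucketWriter> it = writers.values().iterator();
                while (it.hasNext()) {
                    if (it.next().closeIfIdle(nowMillis, idleMillis)) {
                        it.remove();
                        closedCount++;
                    }
                }
            }
            return closedCount;
        }
    }
}
```

In this sketch the watcher takes the map lock and then a writer lock, while the append path never holds both at once, so the two threads cannot deadlock. A thread that still holds a reference to a closed writer just gets false back from append and picks up a fresh writer on its next attempt.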