Flume, mail # dev - Re: [jira] [Commented] (FLUME-1350) HDFS file handle not closed properly when date bucketing


Roshan Naik 2012-10-18, 20:13
We will need to handle race conditions, e.g. a thread that resumes writing
immediately after the watcher thread decides to close the file handle. In
that sense a deterministic close is nicer than a timeout-based 'garbage
collection'.
-roshan
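
To make the race concrete, here is a minimal Java sketch of a close-on-idle handle in which the writer and the watcher share one lock. The class `IdleClosingWriter` and its methods are hypothetical names for illustration, not Flume's actual sink code, and the in-memory list only stands in for an HDFS output stream. Timestamps are passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical close-on-idle writer (illustration only, not Flume's API).
 * Both append() and closeIfIdle() take the same lock, so the watcher can
 * never close the handle in the middle of a write, and a write that arrives
 * just after a close simply reopens the handle.
 */
class IdleClosingWriter {
    private final Object lock = new Object();
    private final long idleTimeoutMillis;
    private final List<String> written = new ArrayList<>(); // stands in for the HDFS file
    private boolean open = false;
    private long lastWriteMillis = 0L;
    private int openCount = 0;

    IdleClosingWriter(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Called by the writing thread; reopens the handle if the watcher closed it. */
    void append(String event, long nowMillis) {
        synchronized (lock) {
            if (!open) {
                open = true;      // a real sink would open a new file here
                openCount++;
            }
            written.add(event);
            lastWriteMillis = nowMillis;
        }
    }

    /** Called by the watcher thread; returns true if it closed the handle. */
    boolean closeIfIdle(long nowMillis) {
        synchronized (lock) {
            // Re-check idleness under the lock: a write may have slipped in
            // between the watcher's wake-up and this call.
            if (open && nowMillis - lastWriteMillis >= idleTimeoutMillis) {
                open = false;     // a real sink would flush and close here
                return true;
            }
            return false;
        }
    }

    boolean isOpen() {
        synchronized (lock) {
            return open;
        }
    }

    int getOpenCount() {
        synchronized (lock) {
            return openCount;
        }
    }
}
```

With a 1000 ms timeout, a write at t=0 followed by `closeIfIdle(500)` leaves the handle open, `closeIfIdle(1500)` closes it, and a later `append` reopens it. The cost of the timeout approach is visible in that last step: a late write produces a second file, which a deterministic close on bucket change would avoid.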
On Thu, Oct 18, 2012 at 12:04 PM, Mike Percy (JIRA) <[EMAIL PROTECTED]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/FLUME-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479255#comment-13479255]
>
> Mike Percy commented on FLUME-1350:
> -----------------------------------
>
> Hi Juhani, something like a close-on-idle timeout makes sense. I'd be
> happy to review it if you want to work on it.
>
> > HDFS file handle not closed properly when date bucketing
> > ---------------------------------------------------------
> >
> >                 Key: FLUME-1350
> >                 URL: https://issues.apache.org/jira/browse/FLUME-1350
> >             Project: Flume
> >          Issue Type: Bug
> >          Components: Sinks+Sources
> >    Affects Versions: v1.1.0, v1.2.0
> >            Reporter: Robert Mroczkowski
> >         Attachments: HDFSEventSink.java.patch
> >
> >
> > With configuration:
> > agent.sinks.hdfs-cafe-access.type = hdfs
> > agent.sinks.hdfs-cafe-access.hdfs.path = hdfs://nga/nga/apache/access/%y-%m-%d/
> > agent.sinks.hdfs-cafe-access.hdfs.fileType = DataStream
> > agent.sinks.hdfs-cafe-access.hdfs.filePrefix = cafe_access
> > agent.sinks.hdfs-cafe-access.hdfs.rollInterval = 21600
> > agent.sinks.hdfs-cafe-access.hdfs.rollSize = 10485760
> > agent.sinks.hdfs-cafe-access.hdfs.rollCount = 0
> > agent.sinks.hdfs-cafe-access.hdfs.txnEventMax = 1000
> > agent.sinks.hdfs-cafe-access.hdfs.batchSize = 1000
> > #agent.sinks.hdfs-cafe-access.hdfs.codeC = snappy
> > agent.sinks.hdfs-cafe-access.hdfs.hdfs.maxOpenFiles = 5000
> > agent.sinks.hdfs-cafe-access.channel = memo-1
> > When a new directory is created, the previous file handle remains open.
> > The rollInterval setting is applied only to files in the current date bucket.
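
The reported behaviour can be illustrated with a small Java sketch of date bucketing. It assumes the `%y-%m-%d` escape expands to a two-digit year, month and day, and it resolves dates in UTC purely to keep the example deterministic. The class `DateBucketDemo` and its methods are hypothetical and do not reflect the actual HDFSEventSink code.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Hypothetical model of date-bucketed output paths (illustration only).
 * Each distinct bucket path gets its own open handle, and nothing in this
 * model closes a handle once the date moves on.
 */
class DateBucketDemo {
    // Assumed equivalent of the %y-%m-%d escape, resolved in UTC.
    private static final DateTimeFormatter BUCKET =
            DateTimeFormatter.ofPattern("yy-MM-dd").withZone(ZoneOffset.UTC);

    private final Set<String> openHandles = new LinkedHashSet<>();

    /** Expands the base directory into the bucket directory for a timestamp. */
    static String bucketPath(String baseDir, long epochMillis) {
        return baseDir + BUCKET.format(Instant.ofEpochMilli(epochMillis)) + "/";
    }

    /** Records a write; a new bucket path means a new open handle. */
    void write(String baseDir, long epochMillis) {
        openHandles.add(bucketPath(baseDir, epochMillis));
    }

    int openHandleCount() {
        return openHandles.size();
    }
}
```

Writing one event just before midnight and one just after yields two bucket paths and two open handles. In this model the first handle is never written to again, so only a close on idle or an explicit close on bucket change would release it.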
>