Flume dev mailing list: spooldir source reading Flume itself and thinking the file has changed (1.3.1)


Edward Sargisson 2013-05-08, 16:04
Mike Percy 2013-05-08, 17:23
Edward Sargisson 2013-05-11, 00:02
Re: spooldir source reading Flume itself and thinking the file has changed (1.3.1)
Hi Edward,
Thanks for investigating. I'm definitely open to suggestions for
improvement with this. Maybe dying is a bit extreme… the goal was to ensure
that people could not possibly try to use it to tail a file, which will
definitely not work correctly! :)

Mike
On Fri, May 10, 2013 at 5:02 PM, Edward Sargisson <[EMAIL PROTECTED]> wrote:

> Hi Mike,
> I was curious so I went on a bit of a hunt through logger source code.
> The result is that loggers can't be relied on to atomically roll the
> file so a feature to allow a delay before checking the file would be
> of great utility.
>
> For that matter, having Flume not die completely in this scenario
> would also be good.
>
> apache-log4j-extras does this [1]:
> return source.renameTo(destination);
>
> logback does this [2]:
> boolean result = srcFile.renameTo(targetFile);
>
> log4j2 does this [3]:
> srcStream = new FileInputStream(source);
> destStream = new FileOutputStream(destination);
> srcChannel = srcStream.getChannel();
> destChannel = destStream.getChannel();
> destChannel.transferFrom(srcChannel, 0, srcChannel.size());
>
> The JavaDoc for File.renameTo says:
>  Many aspects of the behavior of this method are inherently
>  platform-dependent: The rename operation might not be able to move a
>  file from one filesystem to another, it might not be atomic, and it
>  might not succeed if a file with the destination abstract pathname
>  already exists.  The return value should always be checked to make sure
>  that the rename operation was successful.
>
> My conclusion is that the loggers (except possibly log4j2) can't be
> relied on to atomically roll the file.
>
> Cheers,
> Edward
>
>
> Links:
> [1]
> http://svn.apache.org/viewvc/logging/log4j/companions/extras/trunk/src/main/java/org/apache/log4j/rolling/helper/FileRenameAction.java?view=markup
> (line 77)
> [2]
> https://github.com/qos-ch/logback/blob/master/logback-core/src/main/java/ch/qos/logback/core/rolling/helper/RenameUtil.java
> (line 63)
> [3]
> https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/core/src/main/java/org/apache/logging/log4j/core/appender/rolling/helper/FileRenameAction.java
>
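[Editor's note: since File.renameTo reports failure only through its boolean return value, a minimal sketch of the NIO.2 alternative (available since Java 7) may be useful here. Files.move with ATOMIC_MOVE either moves the file atomically or throws, so a half-copied file can never appear in the spool directory. AtomicSpoolMove and moveIntoSpool are illustrative names, not Flume or log4j API.]

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicSpoolMove {
    // Move a finished log file into the spool directory so that a
    // spooldir-style consumer never observes a partially written file.
    // Unlike File.renameTo, Files.move with ATOMIC_MOVE either succeeds
    // atomically or throws (e.g. AtomicMoveNotSupportedException when
    // source and target are on different filesystems).
    static void moveIntoSpool(Path source, Path spoolDir) throws IOException {
        Path target = spoolDir.resolve(source.getFileName());
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Cross-filesystem fallback: copy to a temporary name the
            // consumer ignores, then rename within the spool directory.
            Path tmp = spoolDir.resolve(source.getFileName() + ".tmp");
            Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, target);
            Files.delete(source);
        }
    }

    public static void main(String[] args) throws IOException {
        Path spool = Files.createTempDirectory("spool-demo");
        Path src = Files.createTempFile("app", ".log");
        Files.write(src, "log line\n".getBytes());
        moveIntoSpool(src, spool);
        System.out.println(Files.exists(spool.resolve(src.getFileName())));
    }
}
```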
>
> >Hi Edward,
> >Spooldir assumes that the file is atomically moved into the directory in
> >question. Maybe we should make it wait until the last mod time is > 10
> >seconds old, to avoid situations where a readdir + stat catches it during
> >a file copy operation and thinks you're changing the files. If you're
> >sure that the file name is not being reused then maybe it's a bug.
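[Editor's note: the settle-window idea suggested above could be sketched as follows. This is a hypothetical helper, not Flume code; SETTLE_MILLIS and isQuiescent are illustrative names, and 10 seconds is just the value floated in the thread.]

```java
import java.io.File;

public class QuiescenceCheck {
    // Skip files whose last-modified time falls inside the settle
    // window, so that a readdir + stat racing a slow copy does not
    // look like the file changing underneath the reader.
    static final long SETTLE_MILLIS = 10_000L;

    static boolean isQuiescent(File f, long nowMillis) {
        return nowMillis - f.lastModified() >= SETTLE_MILLIS;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("spool", ".log");
        // Backdate the file by a minute: well outside the settle window.
        f.setLastModified(System.currentTimeMillis() - 60_000L);
        System.out.println(isQuiescent(f, System.currentTimeMillis()));
    }
}
```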
>
> >We require the file name to be unique because we don't have a great way
> >to read the inode with Java.
>
> >Mike
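[Editor's note on the inode point: NIO.2 (Java 7) does expose an inode-like identity via BasicFileAttributes.fileKey(). A minimal sketch, assuming a POSIX filesystem; FileIdentity and identityOf are illustrative names, and fileKey() may return null on filesystems without such a concept, so name-based tracking would still be needed as a fallback.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class FileIdentity {
    // fileKey() wraps (device, inode) on POSIX filesystems, giving a
    // file identity that survives renames; it can be null where the
    // underlying filesystem has no such notion.
    static Object identityOf(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class).fileKey();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fk-demo");
        Path a = Files.createTempFile(dir, "ident", ".log");
        Object before = identityOf(a);
        // On POSIX, renaming does not change the inode, so the key is stable.
        Path b = Files.move(a, dir.resolve("renamed.log"));
        System.out.println(before != null && before.equals(identityOf(b)));
    }
}
```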
>
>
> >On Wed, May 8, 2013 at 9:04 AM, Edward Sargisson <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> > I've been experimenting with using Flume to transport Flume's own logs.
> >
> > My setup is that I have the apache-log4j-extras RollingFileAppender
> > rolling the file every minute into a spool directory. Then I have Flume
> > with the spooldir source reading that directory.
> >
> > Very often I see:
> > java.lang.IllegalStateException: File name has been re-used with
> > different files. Spooling assumption violated for
> > /opt/castellan/log/spool/castellan-reader.20130430T2055.log.COMPLETED
> >         at org.apache.flume.client.avro.SpoolingFileLineReader.retireCurrentFile(SpoolingFileLineReader.java:272)
> >
> > Then it never recovers, repeatedly saying:
> > 2013-05-08 15:57:19,233 ERROR [pool-10-thread-1] org.apache.flume.source.SpoolDirectorySource Uncaught exception in Runnable
> > java.io.IOException: Stream closed
> >         at java.io.BufferedReader.ensureOpen(BufferedReader.java:115)
> >         at java.io.BufferedReader.readLine(BufferedReader.java:310)
> >         at java.io.BufferedReader.readLine(BufferedReader.java:382)
> >         at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:180)
Edward Sargisson 2013-05-13, 15:45
Edward Sargisson 2013-05-14, 23:50
Mike Percy 2013-05-22, 07:18
Edward Sargisson 2013-05-22, 16:22
Phil Scala 2013-05-23, 00:24
Mike Percy 2013-05-23, 01:13