Flume >> mail # user >> .SpoolingFileLineReader warning....


Re: .SpoolingFileLineReader warning....
Hey Brock,

I can do some more testing on my side with smaller files, as well as
comparing a mv against a cp. I do believe that a slight delay would be
helpful, since people will be moving/copying large files around.

Regards,

Dano
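
The "slight delay" idea (only picking up a file once its mtime is about 1 second old, as proposed in the quoted message below) could be sketched roughly as follows. This is an illustration only, not Flume's implementation: the function names, the `min_age_seconds` parameter, and the choice to skip dot-files are assumptions made for the example; the `.COMPLETED` suffix is the one mentioned later in this thread.

```python
import os
import time


def is_settled(path, min_age_seconds=1.0, now=None):
    """Return True if path was last modified at least min_age_seconds ago."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) >= min_age_seconds


def settled_files(directory, min_age_seconds=1.0, now=None):
    """List regular files in directory that have not been modified recently.

    Hypothetical helper: skips files already marked .COMPLETED and
    (as an assumption for this sketch) hidden dot-files.
    """
    ready = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if name.startswith(".") or name.endswith(".COMPLETED"):
            continue
        if not os.path.isfile(full):
            continue
        if is_settled(full, min_age_seconds, now):
            ready.append(full)
    return ready
```

A file that is still being copied keeps getting a fresh mtime, so it would be skipped on each pass until the writer has been idle for `min_age_seconds`. A copy that stalls for longer than that would still be picked up early, so this narrows the race rather than removing it.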
On Nov 20, 2012 5:26 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote:

> Thinking about this more, I think it's probably going to be quite
> common for people to cp large files into the spooling directory.
> Patrick, what do you think about waiting until the mtime is say 1
> second old?
>
> Brock
>
> On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
> > My guess is that the file does not have the correct permissions while
> > being copied.
> >
> > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al test*
> > [1] 18780
> > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0
> > -rw------- 1 noland noland   52334592 Nov 19 17:27 test-1
> >
> >
> > For large files, it probably makes sense to copy the file in as .file
> > and then rename it to file.
> >
> > Brock
> >
> > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
> >> The spooling source gets a directory listing, then reads each file, then
> >> renames it to X.COMPLETED. Is it possible some other process deleted that
> >> file between when Flume listed the directory and when it tried to open the
> >> file? Otherwise, I'm confused why the file would not be present in the
> >> listing you give here.
> >>
> >>
> >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hey Dan,
> >>>
> >>> You say that it seems like Flume has already processed the log... why do
> >>> you think that?
> >>>
> >>> When you listed the directory contents I don't see the original or the
> >>> COMPLETED version of the file that Flume is complaining about:
> >>>
> >>> /clickstream.log-2012-11-17-1353163623
> >>>
> >>> doesn't appear in the
> >>>
> >>> /mnt/flume/clickstream/
> >>>
> >>> directory listing anywhere.
> >>>
> >>>
> >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>> Hello Brock,
> >>>>
> >>>> It seems like we get this message each time that logrotate runs and is in
> >>>> the process of copying the file to the SpoolingDirectory. It seems that
> >>>> Flume starts reading the file as soon as it shows up in the
> >>>> SpoolingDirectory..... Maybe it's trying to read the file while it's still
> >>>> being written to????
> >>>>
> >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] Could not find file: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239
> >>>> java.io.FileNotFoundException: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission denied)
> >>>> at java.io.FileInputStream.open(Native Method)
> >>>> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> >>>> at java.io.FileReader.<init>(FileReader.java:72)
> >>>> at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
> >>>> at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
> >>>> at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
> >>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> >>>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
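
For reference, the workaround suggested in the quoted messages above (copy the file in as .file, then rename it to file) might look roughly like this on the producer side. This is a sketch under assumptions, not part of Flume: `deliver_to_spool` is a hypothetical helper name, and it only helps if the reader ignores dot-files in the spooling directory, which this thread suggests but does not confirm.

```python
import os
import shutil


def deliver_to_spool(src, spool_dir):
    """Copy src into spool_dir under a temporary dot-name, then rename it.

    Hypothetical helper: the slow copy happens under ".<name>", and the
    final name only appears once the data is fully written.
    """
    name = os.path.basename(src)
    tmp = os.path.join(spool_dir, "." + name)
    final = os.path.join(spool_dir, name)
    shutil.copy2(src, tmp)  # slow part; also copies mtime and permission bits
    os.rename(tmp, final)   # same directory, so an atomic rename on POSIX
    return final
```

Because the temporary and final names live in the same directory, the rename stays on one filesystem, so a reader listing the directory sees either no file under the final name or the complete file.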