Flume, mail # user - Uncaught Exception When Using Spooling Directory Source


Re: Uncaught Exception When Using Spooling Directory Source
Mike Percy 2013-01-18, 07:45
Can you provide more detail about what kinds of services?

If you roll the logs every 5 minutes or so, you can configure the spooling
source to pick them up once they are rolled. Either roll them into a
directory that holds only immutable files, or use the trunk version of the
spooling file source, which lets you specify a filter to ignore files that
don't match a "rolled" pattern.

You could also use the exec source with "tail -F", but that is much less
reliable than the spooling file source.
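
As a rough sketch of the first option (rolling files into a directory that
only ever holds immutable files), a small script run after each roll could
move just the files whose names match a "rolled" pattern into the spooling
directory. The regex, the names `is_rolled` and `move_rolled_files`, and the
directory arguments are assumptions for illustration and are not part of
Flume; the pattern is modeled on the file name in the log quoted further
down. `os.rename` is atomic on POSIX only within one file system and raises
`OSError` across file systems.

```python
import os
import re

# Hypothetical "rolled" name: <base>.log.<start>-<end>.<suffix>, where both
# timestamps are 14 digits. Adjust this to your own rolling scheme.
ROLLED = re.compile(r"^.+\.log\.\d{14}-\d{14}\..+$")


def is_rolled(name):
    """Return True if the file name looks like a finished, rolled log."""
    return ROLLED.match(name) is not None


def move_rolled_files(log_dir, spool_dir):
    """Atomically move rolled files from log_dir into spool_dir.

    Both directories are assumed to be on the same file system; os.rename
    raises OSError if they are not, rather than falling back to a copy.
    """
    moved = []
    for name in sorted(os.listdir(log_dir)):
        src = os.path.join(log_dir, name)
        if os.path.isfile(src) and is_rolled(name):
            os.rename(src, os.path.join(spool_dir, name))
            moved.append(name)
    return moved
```

The file that is still being written (for example `sspstat.log`) never
matches the pattern, so it stays in the log directory until it is rolled.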

Regards,
Mike
On Thu, Jan 17, 2013 at 10:23 PM, Henry Ma <[EMAIL PROTECTED]> wrote:

> OK, thank you very much, now I know why the problem occurs.
>
> I am a newcomer to Flume. Here is my scenario: using Flume to collect files
> from hundreds of directories on dozens of servers into a central storage.
> It seems that the spooling directory source may not be the best choice. Can
> someone give me some advice about how to design the architecture? Which
> types of source and sink would fit?
>
> Thanks!
>
>
> On Fri, Jan 18, 2013 at 2:05 PM, Mike Percy <[EMAIL PROTECTED]> wrote:
>
>> Hi Henry,
>> The files must be immutable before putting them into the spooling
>> directory. So if you copy them from a different file system then you can
>> run into this issue. The right way to do it is to copy them to the same
>> file system and then atomically move them into the spooling directory.
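
The copy-then-move advice above could look roughly like this. It is a
minimal sketch, not Flume code: `publish_to_spool`, `staging_dir`, and
`spool_dir` are hypothetical names, and the staging directory is assumed to
be on the same file system as the spooling directory and not watched by
Flume.

```python
import os
import shutil


def publish_to_spool(src_path, staging_dir, spool_dir):
    """Copy src_path into staging_dir, then atomically move it into spool_dir.

    The slow copy (possibly from another file system) happens in staging, so
    the file only appears in the spooling directory once it is complete.
    """
    name = os.path.basename(src_path)
    staged = os.path.join(staging_dir, name)
    final = os.path.join(spool_dir, name)

    shutil.copyfile(src_path, staged)  # the file grows here, out of Flume's sight
    os.rename(staged, final)           # atomic on POSIX within one file system
    return final
```

Because the rename either fully happens or does not happen at all, the
source should not observe a file whose size is still changing.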
>>
>> Regards,
>> Mike
>>
>>
>> On Thu, Jan 17, 2013 at 9:59 PM, Henry Ma <[EMAIL PROTECTED]> wrote:
>>
>>> Thank you very much! I cleaned all the related directories and restarted.
>>> I kept the source spooling dir empty, then started Flume, and then put
>>> some files into the spooling dir. But this time a new error occurred:
>>>
>>> 13/01/18 13:44:24 INFO avro.SpoolingFileLineReader: Preparing to move file /disk2/mahy/FLUME_TEST/source/sspstat.log.20130118112700-20130118112800.hs016.ssp to /disk2/mahy/FLUME_TEST/source/sspstat.log.20130118112700-20130118112800.hs016.ssp.COMPLETED
>>> 13/01/18 13:44:24 ERROR source.SpoolDirectorySource: Uncaught exception in Runnable
>>> java.lang.IllegalStateException: File has changed size since being read: /disk2/mahy/FLUME_TEST/source/sspstat.log.20130118112700-20130118112800.hs016.ssp
>>>         at org.apache.flume.client.avro.SpoolingFileLineReader.retireCurrentFile(SpoolingFileLineReader.java:241)
>>>         at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:185)
>>>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>         at java.lang.Thread.run(Thread.java:662)
>>> 13/01/18 13:44:24 ERROR source.SpoolDirectorySource: Uncaught exception in Runnable
>>> java.io.IOException: Stream closed
>>>         at java.io.BufferedReader.ensureOpen(BufferedReader.java:97)
>>>         at java.io.BufferedReader.readLine(BufferedReader.java:292)
>>>         at java.io.BufferedReader.readLine(BufferedReader.java:362)
>>>         at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:180)
>>>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)