Flume >> mail # user >> .SpoolingFileLineReader warning....


Dan Young 2012-11-17, 15:02
Brock Noland 2012-11-17, 15:57
Dan Young 2012-11-17, 16:01
Brock Noland 2012-11-17, 16:15
Dan Young 2012-11-17, 16:33
Dan Young 2012-11-19, 19:33
Patrick Wendell 2012-11-19, 23:03
Patrick Wendell 2012-11-19, 23:04
Re: .SpoolingFileLineReader warning....
My guess is that the file does not yet have its final permissions while
it is being copied. Partway through a cp -p, the destination is still
mode 600:

[noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al test*
[1] 18780
-rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0
-rw------- 1 noland noland   52334592 Nov 19 17:27 test-1

For large files, it probably makes sense to copy the file in as .file
and then rename it to file once the copy has finished.
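
The copy-then-rename idea can be sketched roughly as follows. This is only an illustration, not something taken from Flume or logrotate: the function name spool_copy and its arguments are made up here, and it assumes that the spooling source skips dot-prefixed files (as the suggestion above implies) and that the hidden staging name lives in the same directory, so the final mv is a plain rename rather than another copy.

```shell
#!/bin/sh
# Sketch only: stage a file under a hidden name, then rename it into place.
# spool_copy and its arguments are illustrative names, not part of Flume.

spool_copy() {
    src=$1          # file to hand over, e.g. a rotated log
    spool_dir=$2    # directory watched by the spooling source
    name=$(basename "$src")
    tmp="$spool_dir/.$name"

    # Copy under a dot-prefixed name. While cp is running, the file may be
    # incomplete and may still carry restrictive permissions.
    # Only if the copy succeeds, rename within the same directory, so the
    # final name appears with complete content and final permissions.
    cp -p "$src" "$tmp" && mv "$tmp" "$spool_dir/$name"
}
```

A call would look something like `spool_copy /path/to/rotated.log /mnt/flume/clickstream2`, where the source path is a placeholder. If the copy fails, the rename is skipped and the hidden partial file is left behind for inspection.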

Brock

On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
> The spooling source gets a directory listing, then reads each file, then
> renames it to X.COMPLETED. Is it possible some other process deleted that
> file between when Flume listed the directory and when it tried to open the
> file? Otherwise, I'm confused why the file would not be present in the
> listing you give here.
>
>
> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
>>
>> Hey Dan,
>>
>> You say that it seems like Flume has already processed the log... why do
>> you think that?
>>
>> When you listed the directory contents I don't see the original or the
>> COMPLETED version of the file that Flume is complaining about:
>>
>> /clickstream.log-2012-11-17-1353163623
>>
>> doesn't appear in the
>>
>> /mnt/flume/clickstream/
>>
>> directory listing anywhere.
>>
>>
>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote:
>>>
>>> Hello Brock,
>>>
>>> It seems like we get this message each time logrotate runs and is in
>>> the process of copying the file to the spooling directory. It seems that
>>> Flume starts reading the file as soon as it shows up there. Maybe it's
>>> trying to read the file while it's still being written?
>>>
>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN -
>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)]
>>> Could not find file:
>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239
>>> java.io.FileNotFoundException:
>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission
>>> denied)
>>> at java.io.FileInputStream.open(Native Method)
>>> at java.io.FileInputStream.<init>(FileInputStream.java:138)
>>> at java.io.FileReader.<init>(FileReader.java:72)
>>> at
>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322)
>>> at
>>> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172)
>>> at
>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>> at
>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>> at java.lang.Thread.run(Thread.java:722)
>>>
>>>
>>>
>>>
>>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>>>
>>>> OK, do you mind sharing your logrotate config so we can try to
>>>> reproduce this?
>>>>
>>>> --
>>>> Brock Noland
>>>> Sent with Sparrow
>>>>
>>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote:
>>>>
>>>> Hey Brock,
>>>>
>>>> No, I have not modified the conf while the agent was running.
>>>>
>>>> /mnt/flume is local. Note that this is running on an EC2 instance and
>>>> the disk is the ephemeral drive, not EBS.
>>>>
>>>> Regards,
>>>>
>>>> Dano
>>>>
>>>> On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I highly doubt it's related to
>>>> (https://issues.apache.org/jira/browse/FLUME-1721) but have you

Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
Brock Noland 2012-11-20, 12:25
Dan Young 2012-11-20, 15:02
Brock Noland 2012-11-20, 16:21
Dan Young 2012-11-20, 16:59
Brock Noland 2012-11-20, 17:01
Dan Young 2012-11-20, 17:10
Brock Noland 2012-11-20, 17:14
Dan Young 2012-11-20, 17:17
Dan Young 2012-11-20, 20:03
Brock Noland 2012-11-20, 20:06
Patrick Wendell 2012-11-23, 12:46