|
Dan Young
2012-11-17, 15:02
Brock Noland
2012-11-17, 15:57
Dan Young
2012-11-17, 16:01
Brock Noland
2012-11-17, 16:15
Dan Young
2012-11-17, 16:33
Dan Young
2012-11-19, 19:33
Patrick Wendell
2012-11-19, 23:03
Patrick Wendell
2012-11-19, 23:04
Brock Noland
2012-11-19, 23:29
Brock Noland
2012-11-20, 12:25
Dan Young
2012-11-20, 15:02
Brock Noland
2012-11-20, 16:21
Dan Young
2012-11-20, 16:59
Brock Noland
2012-11-20, 17:01
Dan Young
2012-11-20, 17:10
Brock Noland
2012-11-20, 17:14
Dan Young
2012-11-20, 17:17
Dan Young
2012-11-20, 20:03
Brock Noland
2012-11-20, 20:06
Patrick Wendell
2012-11-23, 12:46
|
-
.SpoolingFileLineReader warning....Dan Young 2012-11-17, 15:02
First a bit of context, I'm using logrotate to monitor and copy (cp -p) log
files to a flume spooling directory source. So every hour, logrotate checks for and copies a file from the source to the flume destination. I see the following warning message in the flume logs. 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - Could not find file: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 java.io.FileNotFoundException: /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission denied) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.<init>(FileInputStream.java:138) at java.io.FileReader.<init>(FileReader.java:72) at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Although it appears that Flume processes the log, I'm curious why I''m seeing this and if I have anything with permissions incorrect? Here's the permissions: source log directory under /var/log: drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:47 clickstream source files: -rw-rw-r-- 1 ubuntu ubuntu 9055750 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.gz -rw-rw-r-- 1 ubuntu ubuntu 13583565 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.gz -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623 -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 17 14:52 clickstream.log flume source directory under /mnt/flume: drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream flume source files: -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 clickstream.log-2012-11-17-1353158953.COMPLETED -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 clickstream.log-2012-11-17-1353161821.COMPLETED -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 clickstream.log-2012-11-17-1353163623.COMPLETED Any insight would be appreciated. Regards, Dan +
Dan Young 2012-11-17, 15:02
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-17, 15:57
Hi,
I highly doubt it's related to (https://issues.apache.org/jira/browse/FLUME-1721) but have you modified the configuration file since starting the agent? If so, can you restart the agent and see if the error continues? Also, is /mnt/flume local disk or NAS? Brock On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > First a bit of context, I'm using logrotate to monitor and copy (cp -p) log > files to a flume spooling directory source. So every hour, logrotate checks > for and copies a file from the source to the flume destination. I see the > following warning message in the flume logs. > > > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > Could not find file: > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > java.io.FileNotFoundException: > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission > denied) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at java.io.FileReader.<init>(FileReader.java:72) > at > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > at > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > at > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:722) > > > Although it appears that Flume processes the log, I'm curious why I''m > seeing this and if I have anything with permissions incorrect? > > > > Here's the permissions: > > source log directory under /var/log: > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:47 clickstream > > source files: > -rw-rw-r-- 1 ubuntu ubuntu 9055750 Nov 17 13:29 > clickstream.log-2012-11-17-1353158953.gz > -rw-rw-r-- 1 ubuntu ubuntu 13583565 Nov 17 14:17 > clickstream.log-2012-11-17-1353161821.gz > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > clickstream.log-2012-11-17-1353163623 > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 17 14:52 clickstream.log > > flume source directory under /mnt/flume: > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream > > flume source files: > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 > clickstream.log-2012-11-17-1353158953.COMPLETED > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 > clickstream.log-2012-11-17-1353161821.COMPLETED > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > clickstream.log-2012-11-17-1353163623.COMPLETED > > Any insight would be appreciated. > > Regards, > > Dan -- Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-17, 15:57
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-17, 16:01
Hey Brock,
No I have not modified the conf while the agent was running. /mnt/flume is local. Note that this is running on an ec2 instance and the disk is the ephemeral drive, not EBS. Regards , Dano On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: > Hi, > > I highly doubt it's related to > (https://issues.apache.org/jira/browse/FLUME-1721) but have you > modified the configuration file since starting the agent? If so, can > you restart the agent and see if the error continues? > > Also, is /mnt/flume local disk or NAS? > > Brock > > On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > First a bit of context, I'm using logrotate to monitor and copy (cp -p) > log > > files to a flume spooling directory source. So every hour, logrotate > checks > > for and copies a file from the source to the flume destination. I see the > > following warning message in the flume logs. > > > > > > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > > Could not find file: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > > java.io.FileNotFoundException: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission > > denied) > > at java.io.FileInputStream.open(Native Method) > > at java.io.FileInputStream.<init>(FileInputStream.java:138) > > at java.io.FileReader.<init>(FileReader.java:72) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > > at > > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > > at > > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > > at > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > > at > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > > at > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > at java.lang.Thread.run(Thread.java:722) > > > > > > Although it appears that Flume processes the log, I'm curious why I''m > > seeing this and if I have anything with permissions incorrect? > > > > > > > > Here's the permissions: > > > > source log directory under /var/log: > > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:47 clickstream > > > > source files: > > -rw-rw-r-- 1 ubuntu ubuntu 9055750 Nov 17 13:29 > > clickstream.log-2012-11-17-1353158953.gz > > -rw-rw-r-- 1 ubuntu ubuntu 13583565 Nov 17 14:17 > > clickstream.log-2012-11-17-1353161821.gz > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > > clickstream.log-2012-11-17-1353163623 > > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 17 14:52 clickstream.log > > > > flume source directory under /mnt/flume: > > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream > > > > flume source files: > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 > > clickstream.log-2012-11-17-1353158953.COMPLETED > > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 > > clickstream.log-2012-11-17-1353161821.COMPLETED > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > > clickstream.log-2012-11-17-1353163623.COMPLETED > > > > Any insight would be appreciated. > > > > Regards, > > > > Dan > > > > -- > Apache MRUnit - Unit testing MapReduce - > http://incubator.apache.org/mrunit/ > +
Dan Young 2012-11-17, 16:01
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-17, 16:15
Ok, do you mind sharing your log rotate config to see if we can reproduce?
-- Brock Noland Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: > Hey Brock, > No I have not modified the conf while the agent was running. > /mnt/flume is local. Note that this is running on an ec2 instance and the disk is the ephemeral drive, not EBS. > Regards , > Dano > On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED] (mailto:[EMAIL PROTECTED])> wrote: > > Hi, > > > > I highly doubt it's related to > > (https://issues.apache.org/jira/browse/FLUME-1721) but have you > > modified the configuration file since starting the agent? If so, can > > you restart the agent and see if the error continues? > > > > Also, is /mnt/flume local disk or NAS? > > > > Brock > > > > On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED] (mailto:[EMAIL PROTECTED])> wrote: > > > First a bit of context, I'm using logrotate to monitor and copy (cp -p) log > > > files to a flume spooling directory source. So every hour, logrotate checks > > > for and copies a file from the source to the flume destination. I see the > > > following warning message in the flume logs. > > > > > > > > > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > > > Could not find file: > > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > > > java.io.FileNotFoundException: > > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission > > > denied) > > > at java.io.FileInputStream.open(Native Method) > > > at java.io.FileInputStream.<init>(FileInputStream.java:138) > > > at java.io.FileReader.<init>(FileReader.java:72) > > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > > > at > > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > > > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > > > at > > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > > > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > > > at > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > > > at > > > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > > > at > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > Although it appears that Flume processes the log, I'm curious why I''m > > > seeing this and if I have anything with permissions incorrect? > > > > > > > > > > > > Here's the permissions: > > > > > > source log directory under /var/log: > > > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:47 clickstream > > > > > > source files: > > > -rw-rw-r-- 1 ubuntu ubuntu 9055750 Nov 17 13:29 > > > clickstream.log-2012-11-17-1353158953.gz > > > -rw-rw-r-- 1 ubuntu ubuntu 13583565 Nov 17 14:17 > > > clickstream.log-2012-11-17-1353161821.gz > > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 > > > clickstream.log-2012-11-17-1353163623 > > > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 17 14:52 clickstream.log > > > > > > flume source directory under /mnt/flume: > > > drwxrwxr-x 2 ubuntu ubuntu 4096 Nov 17 14:48 clickstream > > > > > > flume source files: > > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 13:29 > > > clickstream.log-2012-11-17-1353158953.COMPLETED > > > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 17 14:17 > > > clickstream.log-2012-11-17-1353161821.COMPLETED > > > -rw-rw-r-- 1 ubuntu ubuntu 131296672 Nov 17 14:47 +
Brock Noland 2012-11-17, 16:15
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-17, 16:33
logrotate config, in /etc/logrotate.d, ran from cron.hourly.
/var/log/clickstream/clickstream.log { missingok rotate 3 compress delaycompress copytruncate notifempty size 50M dateext dateformat -%Y-%m-%d-%s create 666 ubuntu ubuntu postrotate cp -p $1 /mnt/flume/clickstream/ 2>&1 endscript } flume config: # Name the components on this agent agent1.sources = c1 agent1.sinks = c1s3 agent1.channels = ch1 # Describe/configure agent1.sources.c1.type = org.apache.flume.source.SpoolDirectorySource agent1.sources.c1.spoolDir = /mnt/flume/clickstream agent1.sources.c1.fileHeader = false agent1.sources.c1.interceptors = a b agent1.sources.c1.interceptors.a.type org.apache.flume.interceptor.TimestampInterceptor$Builder agent1.sources.c1.interceptors.b.type org.apache.flume.interceptor.HostInterceptor$Builder agent1.sources.c1.interceptors.b.preserveExisting = false agent1.sources.c1.interceptors.b.hostHeader = host # Describe s3 agent1.sinks.c1s3.type = hdfs agent1.sinks.c1s3.hdfs.path s3n://<super_secret_stuff_here>@<my_bucket>/clicks/%Y/%m agent1.sinks.c1s3.hdfs.rollInterval = 300 agent1.sinks.c1s3.hdfs.rollSize = 0 agent1.sinks.c1s3.hdfs.rollCount = 0 agent1.sinks.c1s3.hdfs.batchSize = 400000 agent1.sinks.c1s3.hdfs.codeC = gzip agent1.sinks.c1s3.hdfs.fileType = CompressedStream agent1.sinks.c1s3.hdfs.writeFormat = Text agent1.sinks.c1s3.hdfs.filePrefix = clicks-%Y-%m-%d-%H-%M-%{host}- agent1.sinks.c1s3.hdfs.round = true agent1.sinks.c1s3.hdfs.roundValue = 10 agent1.sinks.c1s3.hdfs.roundUnit = minute # Use a channel which buffers events in memory agent1.channels.ch1.type = file agent1.channels.ch1.transactionCapacity = 400000 agent1.channels.ch1.capacity = 2000000 agent1.channels.ch1.checkpointDir = /mnt/flume/.flume/file-ch1/checkpoint agent1.channels.ch1.dataDirs = /mnt/flume/.flume/file-ch1/data agent1.channels.ch1.checkpointInterval = 30000 # Bind the source and sink to the channel agent1.sources.c1.channels = ch1 agent1.sinks.c1s3.channel = ch1 On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > Ok, do you mind sharing your log rotate config to see if we can > reproduce? > > -- > Brock Noland > Sent with Sparrow <http://www.sparrowmailapp.com/?sig> > > On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: > > Hey Brock, > > No I have not modified the conf while the agent was running. > > /mnt/flume is local. Note that this is running on an ec2 instance and the > disk is the ephemeral drive, not EBS. > > Regards , > > Dano > On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: > > Hi, > > I highly doubt it's related to > (https://issues.apache.org/jira/browse/FLUME-1721) but have you > modified the configuration file since starting the agent? If so, can > you restart the agent and see if the error continues? > > Also, is /mnt/flume local disk or NAS? > > Brock > > On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > First a bit of context, I'm using logrotate to monitor and copy (cp -p) > log > > files to a flume spooling directory source. So every hour, logrotate > checks > > for and copies a file from the source to the flume destination. I see the > > following warning message in the flume logs. > > > > > > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > > Could not find file: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > > java.io.FileNotFoundException: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission > > denied) > > at java.io.FileInputStream.open(Native Method) > > at java.io.FileInputStream.<init>(FileInputStream.java:138) > > at java.io.FileReader.<init>(FileReader.java:72) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) +
Dan Young 2012-11-17, 16:33
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-19, 19:33
Hello Brock,
It seems like we get this message each time that logrotate runs and is in the process of copying the file to the SpoolingDirectory. It seems that Flume starts reading the file as soon as it shows up in the SpoolingDirectory..... Maybe it's trying to read the file while it's still being written to???? 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] Could not find file: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 java.io.FileNotFoundException: /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission denied) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.<init>(FileInputStream.java:138) at java.io.FileReader.<init>(FileReader.java:72) at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > Ok, do you mind sharing your log rotate config to see if we can > reproduce? > > -- > Brock Noland > Sent with Sparrow <http://www.sparrowmailapp.com/?sig> > > On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: > > Hey Brock, > > No I have not modified the conf while the agent was running. > > /mnt/flume is local. Note that this is running on an ec2 instance and the > disk is the ephemeral drive, not EBS. > > Regards , > > Dano > On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: > > Hi, > > I highly doubt it's related to > (https://issues.apache.org/jira/browse/FLUME-1721) but have you > modified the configuration file since starting the agent? If so, can > you restart the agent and see if the error continues? > > Also, is /mnt/flume local disk or NAS? > > Brock > > On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > First a bit of context, I'm using logrotate to monitor and copy (cp -p) > log > > files to a flume spooling directory source. So every hour, logrotate > checks > > for and copies a file from the source to the flume destination. I see the > > following warning message in the flume logs. > > > > > > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > > Could not find file: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 > > java.io.FileNotFoundException: > > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission > > denied) > > at java.io.FileInputStream.open(Native Method) > > at java.io.FileInputStream.<init>(FileInputStream.java:138) > > at java.io.FileReader.<init>(FileReader.java:72) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > > at > > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > > at > > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) +
Dan Young 2012-11-19, 19:33
-
Re: .SpoolingFileLineReader warning....Patrick Wendell 2012-11-19, 23:03
Hey Dan,
You say that it seems like Flume has already processed the log... why do you think that? When you listed the directory contents I don't see the original or the COMPLETED version of the file that Flume is complaining about: /clickstream.log-2012-11-17-1353163623 doesn't appear in the /mnt/flume/clickstream/ directory listing anywhere. On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote: > Hello Brock, > > It seems like we get this message each time that logrotate runs and is in > the process of copying the file to the SpoolingDirectory. It seems that > Flume starts reading the file as soon as it shows up in the > SpoolingDirectory..... Maybe it's trying to read the file while it's still > being written to???? > > 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] > Could not find file: > /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 > java.io.FileNotFoundException: > /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission > denied) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at java.io.FileReader.<init>(FileReader.java:72) > at > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > at > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > at > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:722) > > > > > On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > >> Ok, do you mind sharing your log rotate config to see if we can >> reproduce? >> >> -- >> Brock Noland >> Sent with Sparrow <http://www.sparrowmailapp.com/?sig> >> >> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: >> >> Hey Brock, >> >> No I have not modified the conf while the agent was running. >> >> /mnt/flume is local. Note that this is running on an ec2 instance and the >> disk is the ephemeral drive, not EBS. >> >> Regards , >> >> Dano >> On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: >> >> Hi, >> >> I highly doubt it's related to >> (https://issues.apache.org/jira/browse/FLUME-1721) but have you >> modified the configuration file since starting the agent? If so, can >> you restart the agent and see if the error continues? >> >> Also, is /mnt/flume local disk or NAS? >> >> Brock >> >> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: >> > First a bit of context, I'm using logrotate to monitor and copy (cp -p) >> log >> > files to a flume spooling directory source. So every hour, logrotate >> checks >> > for and copies a file from the source to the flume destination. I see >> the >> > following warning message in the flume logs. >> > >> > >> > 17 Nov 2012 14:47:07,682 WARN [pool-10-thread-1] >> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - >> > Could not find file: >> > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 >> > java.io.FileNotFoundException: >> > /mnt/flume/clickstream/clickstream.log-2012-11-17-1353163623 (Permission >> > denied) >> > at java.io.FileInputStream.open(Native Method) +
Patrick Wendell 2012-11-19, 23:03
-
Re: .SpoolingFileLineReader warning....Patrick Wendell 2012-11-19, 23:04
The spooling source gets a directory listing, then reads each file, then
renames it to X.COMPLETED. Is it possible some other process deleted that file between when Flume listed the directory and when it tried to open the file? Otherwise, I'm confused why the file would not be present in the listing you give here. On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote: > Hey Dan, > > You say that it seems like Flume has already processed the log... why do > you think that? > > When you listed the directory contents I don't see the original or the > COMPLETED version of the file that Flume is complaining about: > > /clickstream.log-2012-11-17-1353163623 > > doesn't appear in the > > /mnt/flume/clickstream/ > > directory listing anywhere. > > > On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote: > >> Hello Brock, >> >> It seems like we get this message each time that logrotate runs and is in >> the process of copying the file to the SpoolingDirectory. It seems that >> Flume starts reading the file as soon as it shows up in the >> SpoolingDirectory..... Maybe it's trying to read the file while it's still >> being written to???? >> >> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - >> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] >> Could not find file: >> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 >> java.io.FileNotFoundException: >> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission >> denied) >> at java.io.FileInputStream.open(Native Method) >> at java.io.FileInputStream.<init>(FileInputStream.java:138) >> at java.io.FileReader.<init>(FileReader.java:72) >> at >> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) >> at >> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) >> at >> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) >> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >> at >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) >> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >> at java.lang.Thread.run(Thread.java:722) >> >> >> >> >> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >> >>> Ok, do you mind sharing your log rotate config to see if we can >>> reproduce? >>> >>> -- >>> Brock Noland >>> Sent with Sparrow <http://www.sparrowmailapp.com/?sig> >>> >>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: >>> >>> Hey Brock, >>> >>> No I have not modified the conf while the agent was running. >>> >>> /mnt/flume is local. Note that this is running on an ec2 instance and >>> the disk is the ephemeral drive, not EBS. >>> >>> Regards , >>> >>> Dano >>> On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: >>> >>> Hi, >>> >>> I highly doubt it's related to >>> (https://issues.apache.org/jira/browse/FLUME-1721) but have you >>> modified the configuration file since starting the agent? If so, can >>> you restart the agent and see if the error continues? >>> >>> Also, is /mnt/flume local disk or NAS? >>> >>> Brock >>> >>> On Sat, Nov 17, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: >>> > First a bit of context, I'm using logrotate to monitor and copy (cp >>> -p) log >>> > files to a flume spooling directory source. So every hour, logrotate >>> checks >>> > for and copies a file from the source to the flume destination. I see +
Patrick Wendell 2012-11-19, 23:04
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-19, 23:29
My guess is that the file does not have the correct permissions while
being copied. [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al test* [1] 18780 -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0 -rw------- 1 noland noland 52334592 Nov 19 17:27 test-1 For large files, it probably makes sense to copy the file in as .file and then rename it to file. Brock On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote: > The spooling source gets a directory listing, then reads each file, then > renames it to X.COMPLETED. Is it possible some other process deleted that > file between when Flume listed the directory and when it tried to open the > file? Otherwise, I'm confused why the file would not be present in the > listing you give here. > > > On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote: >> >> Hey Dan, >> >> You say that it seems like Flume has already processed the log... why do >> you think that? >> >> When you listed the directory contents I don't see the original or the >> COMPLETED version of the file that Flume is complaining about: >> >> /clickstream.log-2012-11-17-1353163623 >> >> doesn't appear in the >> >> /mnt/flume/clickstream/ >> >> directory listing anywhere. >> >> >> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote: >>> >>> Hello Brock, >>> >>> It seems like we get this message each time that logrotate runs and is in >>> the process of copying the file to the SpoolingDirectory. It seems that >>> Flume starts reading the file as soon as it shows up in the >>> SpoolingDirectory..... Maybe it's trying to read the file while it's still >>> being written to???? >>> >>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - >>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] >>> Could not find file: >>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 >>> java.io.FileNotFoundException: >>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission >>> denied) >>> at java.io.FileInputStream.open(Native Method) >>> at java.io.FileInputStream.<init>(FileInputStream.java:138) >>> at java.io.FileReader.<init>(FileReader.java:72) >>> at >>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) >>> at >>> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) >>> at >>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) >>> at >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>> at >>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) >>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >>> at java.lang.Thread.run(Thread.java:722) >>> >>> >>> >>> >>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >>>> >>>> Ok, do you mind sharing your log rotate config to see if we can >>>> reproduce? >>>> >>>> -- >>>> Brock Noland >>>> Sent with Sparrow >>>> >>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: >>>> >>>> Hey Brock, >>>> >>>> No I have not modified the conf while the agent was running. >>>> >>>> /mnt/flume is local. Note that this is running on an ec2 instance and >>>> the disk is the ephemeral drive, not EBS. >>>> >>>> Regards , >>>> >>>> Dano >>>> >>>> On Nov 17, 2012 8:58 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: >>>> >>>> Hi, >>>> >>>> I highly doubt it's related to >>>> (https://issues.apache.org/jira/browse/FLUME-1721) but have you Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-19, 23:29
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-20, 12:25
Thinking about this more, I think it's probably going to be quite
common for people to cp large files into the spooling directory. Patrick, what do you think about waiting until the mtime is say 1 second old? Brock On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland <[EMAIL PROTECTED]> wrote: > My guess is that the file does not have the correct permissions while > being copied. > > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al test* > [1] 18780 > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0 > -rw------- 1 noland noland 52334592 Nov 19 17:27 test-1 > > > For large files, it probably makes sense to copy the file in as .file > and then rename it to file. > > Brock > > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote: >> The spooling source gets a directory listing, then reads each file, then >> renames it to X.COMPLETED. Is it possible some other process deleted that >> file between when Flume listed the directory and when it tried to open the >> file? Otherwise, I'm confused why the file would not be present in the >> listing you give here. >> >> >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote: >>> >>> Hey Dan, >>> >>> You say that it seems like Flume has already processed the log... why do >>> you think that? >>> >>> When you listed the directory contents I don't see the original or the >>> COMPLETED version of the file that Flume is complaining about: >>> >>> /clickstream.log-2012-11-17-1353163623 >>> >>> doesn't appear in the >>> >>> /mnt/flume/clickstream/ >>> >>> directory listing anywhere. >>> >>> >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> wrote: >>>> >>>> Hello Brock, >>>> >>>> It seems like we get this message each time that logrotate runs and is in >>>> the process of copying the file to the SpoolingDirectory. It seems that >>>> Flume starts reading the file as soon as it shows up in the >>>> SpoolingDirectory..... Maybe it's trying to read the file while it's still >>>> being written to???? >>>> >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] >>>> Could not find file: >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 >>>> java.io.FileNotFoundException: >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 (Permission >>>> denied) >>>> at java.io.FileInputStream.open(Native Method) >>>> at java.io.FileInputStream.<init>(FileInputStream.java:138) >>>> at java.io.FileReader.<init>(FileReader.java:72) >>>> at >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) >>>> at >>>> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) >>>> at >>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) >>>> at >>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>>> at >>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) >>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) >>>> at >>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) >>>> at >>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>>> at >>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >>>> at >>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >>>> at java.lang.Thread.run(Thread.java:722) >>>> >>>> >>>> >>>> >>>> On Sat, Nov 17, 2012 at 9:15 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >>>>> >>>>> Ok, do you mind sharing your log rotate config to see if we can >>>>> reproduce? >>>>> >>>>> -- >>>>> Brock Noland >>>>> Sent with Sparrow >>>>> >>>>> On Saturday, November 17, 2012 at 10:01 AM, Dan Young wrote: Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-20, 12:25
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-20, 15:02
Hey Brock,
I can do some more testing on my side with smaller files as well as doing a mv vs a cp . I do believe that a slight delay would be helpful since people will be moving/copying large files around. Regards , Dano On Nov 20, 2012 5:26 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: > Thinking about this more, I think it's probably going to be quite > common for people to cp large files into the spooling directory. > Patrick, what do you think about waiting until the mtime is say 1 > second old? > > Brock > > On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland <[EMAIL PROTECTED]> wrote: > > My guess is that the file does not have the correct permissions while > > being copied. > > > > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al > test* > > [1] 18780 > > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0 > > -rw------- 1 noland noland 52334592 Nov 19 17:27 test-1 > > > > > > For large files, it probably makes sense to copy the file in as .file > > and then rename it to file. > > > > Brock > > > > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> > wrote: > >> The spooling source gets a directory listing, then reads each file, then > >> renames it to X.COMPLETED. Is it possible some other process deleted > that > >> file between when Flume listed the directory and when it tried to open > the > >> file? Otherwise, I'm confused why the file would not be present in the > >> listing you give here. > >> > >> > >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> > wrote: > >>> > >>> Hey Dan, > >>> > >>> You say that it seems like Flume has already processed the log... why > do > >>> you think that? > >>> > >>> When you listed the directory contents I don't see the original or the > >>> COMPLETED version of the file that Flume is complaining about: > >>> > >>> /clickstream.log-2012-11-17-1353163623 > >>> > >>> doesn't appear in the > >>> > >>> /mnt/flume/clickstream/ > >>> > >>> directory listing anywhere. > >>> > >>> > >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> > wrote: > >>>> > >>>> Hello Brock, > >>>> > >>>> It seems like we get this message each time that logrotate runs and > is in > >>>> the process of copying the file to the SpoolingDirectory. It seems > that > >>>> Flume starts reading the file as soon as it shows up in the > >>>> SpoolingDirectory..... Maybe it's trying to read the file while it's > still > >>>> being written to???? > >>>> > >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - > >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] > >>>> Could not find file: > >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 > >>>> java.io.FileNotFoundException: > >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 > (Permission > >>>> denied) > >>>> at java.io.FileInputStream.open(Native Method) > >>>> at java.io.FileInputStream.<init>(FileInputStream.java:138) > >>>> at java.io.FileReader.<init>(FileReader.java:72) > >>>> at > >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > >>>> at > >>>> > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > >>>> at > >>>> > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > >>>> at > >>>> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > >>>> at > >>>> > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > >>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > >>>> at > >>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > >>>> at > >>>> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > >>>> at > >>>> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) +
Dan Young 2012-11-20, 15:02
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-20, 16:21
Yeah I think that makes sense, I have created a JIRA for this
https://issues.apache.org/jira/browse/FLUME-1733 Brock On Tue, Nov 20, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > Hey Brock, > > I can do some more testing on my side with smaller files as well as doing a > mv vs a cp . I do believe that a slight delay would be helpful since people > will be moving/copying large files around. > > Regards , > > Dano > > On Nov 20, 2012 5:26 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: >> >> Thinking about this more, I think it's probably going to be quite >> common for people to cp large files into the spooling directory. >> Patrick, what do you think about waiting until the mtime is say 1 >> second old? >> >> Brock >> >> On Mon, Nov 19, 2012 at 5:29 PM, Brock Noland <[EMAIL PROTECTED]> wrote: >> > My guess is that the file does not have the correct permissions while >> > being copied. >> > >> > [noland@localhost cp-test]$ cp -p test-0 test-1 & sleep 0.1; ls -al >> > test* >> > [1] 18780 >> > -rw-rw-r-- 1 noland noland 1048576000 Nov 19 17:25 test-0 >> > -rw------- 1 noland noland 52334592 Nov 19 17:27 test-1 >> > >> > >> > For large files, it probably makes sense to copy the file in as .file >> > and then rename it to file. >> > >> > Brock >> > >> > On Mon, Nov 19, 2012 at 5:04 PM, Patrick Wendell <[EMAIL PROTECTED]> >> > wrote: >> >> The spooling source gets a directory listing, then reads each file, >> >> then >> >> renames it to X.COMPLETED. Is it possible some other process deleted >> >> that >> >> file between when Flume listed the directory and when it tried to open >> >> the >> >> file? Otherwise, I'm confused why the file would not be present in the >> >> listing you give here. >> >> >> >> >> >> On Mon, Nov 19, 2012 at 6:03 PM, Patrick Wendell <[EMAIL PROTECTED]> >> >> wrote: >> >>> >> >>> Hey Dan, >> >>> >> >>> You say that it seems like Flume has already processed the log... why >> >>> do >> >>> you think that? >> >>> >> >>> When you listed the directory contents I don't see the original or the >> >>> COMPLETED version of the file that Flume is complaining about: >> >>> >> >>> /clickstream.log-2012-11-17-1353163623 >> >>> >> >>> doesn't appear in the >> >>> >> >>> /mnt/flume/clickstream/ >> >>> >> >>> directory listing anywhere. >> >>> >> >>> >> >>> On Mon, Nov 19, 2012 at 2:33 PM, Dan Young <[EMAIL PROTECTED]> >> >>> wrote: >> >>>> >> >>>> Hello Brock, >> >>>> >> >>>> It seems like we get this message each time that logrotate runs and >> >>>> is in >> >>>> the process of copying the file to the SpoolingDirectory. It seems >> >>>> that >> >>>> Flume starts reading the file as soon as it shows up in the >> >>>> SpoolingDirectory..... Maybe it's trying to read the file while it's >> >>>> still >> >>>> being written to???? >> >>>> >> >>>> 2012-11-19 19:27:27,924 (pool-12-thread-1) [WARN - >> >>>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:328)] >> >>>> Could not find file: >> >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 >> >>>> java.io.FileNotFoundException: >> >>>> /mnt/flume/clickstream2/clickstream2.log-2012-11-19-1353353239 >> >>>> (Permission >> >>>> denied) >> >>>> at java.io.FileInputStream.open(Native Method) >> >>>> at java.io.FileInputStream.<init>(FileInputStream.java:138) >> >>>> at java.io.FileReader.<init>(FileReader.java:72) >> >>>> at >> >>>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) >> >>>> at >> >>>> >> >>>> org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) >> >>>> at >> >>>> >> >>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) >> >>>> at >> >>>> >> >>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >> >>>> at >> >>>> >> >>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) >> >>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-20, 16:21
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-20, 16:59
Coolio, thank you Brock.
Did a quick test with a mv vs cp.....this is one test, repeated a few times....the smaller file (~60M) seemed to work fine w/ a mv, but a larger file we're seeing the same behavior.... Start with no logs in the SpoolingDirectory: ls -lrt /mnt/flume/clickstream /mnt/flume/clickstream: total 0 Review the logs that will be rotated via logrotate.d into the respective directory; /mnt/flume/clickstream ls -lrt /var/log/clickstream /var/log/clickstream: total 64112 -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log Review logrotate config in /etc/logrotate.d. Note here, I changed from cp -p to a mv..... /var/log/clickstream/clickstream.log { missingok rotate 3 compress delaycompress copytruncate notifempty size 50M dateext dateformat -%Y-%m-%d-%s create 666 ubuntu ubuntu postrotate mv $1 /mnt/flume/clickstream/ 2>&1 endscript } I run logrotate.d/clickstream.POST with the ~60MB file, and everything looked fine....now I try a ~190MB file.... ls -lrt /var/log/clickstream /var/log/clickstream: total 192336 -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume log. .... .... 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - Could not find file: /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 java.io.FileNotFoundException: /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 (Permission denied) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.<init>(FileInputStream.java:138) at java.io.FileReader.<init>(FileReader.java:72) at org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) at org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) 20 Nov 2012 16:45:13,174 INFO [hdfs-c1s3-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:205) - Creating s3n://X:Y@my-bucket /clicks/2012/11/clicks-2012-11-20-16-40-10.145.184.200-.1353429913104.gz.tmp .... .... But it seems that FLume did process the log..... ls -lrt /mnt/flume/clickstream/ total 256704 -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:25 clickstream.log-2012-11-20-1353428715.COMPLETED -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:45 clickstream.log-2012-11-20-1353429906.COMPLETED Regards, Dano On Tue, Nov 20, 2012 at 9:21 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > Yeah I think that makes sense, I have created a JIRA for this > > https://issues.apache.org/jira/browse/FLUME-1733 > > Brock > > On Tue, Nov 20, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > Hey Brock, > > > > I can do some more testing on my side with smaller files as well as > doing a > > mv vs a cp . I do believe that a slight delay would be helpful since > people > > will be moving/copying large files around. > > > > Regards , > > > > Dano > > > > On Nov 20, 2012 5:26 AM, "Brock Noland" <[EMAIL PROTECTED]> wrote: > >> > >> Thinking about this more, I think it's probably going to be quite > >> common for people to cp large files into the spooling directory. > >> Patrick, what do you think about waiting until the mtime is say 1 +
Dan Young 2012-11-20, 16:59
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-20, 17:01
Are /var/log and /mnt/flume on the same file system? mv across file
systems is a cp and then delete. On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> wrote: > Coolio, thank you Brock. > > Did a quick test with a mv vs cp.....this is one test, repeated a few > times....the smaller file (~60M) seemed to work fine w/ a mv, but a larger > file we're seeing the same behavior.... > > > Start with no logs in the SpoolingDirectory: > > ls -lrt /mnt/flume/clickstream > > /mnt/flume/clickstream: > total 0 > > > > Review the logs that will be rotated via logrotate.d into the respective > directory; /mnt/flume/clickstream > > ls -lrt /var/log/clickstream > /var/log/clickstream: > total 64112 > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > > > > Review logrotate config in /etc/logrotate.d. Note here, I changed from cp -p > to a mv..... > > /var/log/clickstream/clickstream.log > { > missingok > rotate 3 > compress > delaycompress > copytruncate > notifempty > size 50M > dateext > dateformat -%Y-%m-%d-%s > create 666 ubuntu ubuntu > postrotate > mv $1 /mnt/flume/clickstream/ 2>&1 > endscript > } > > > I run logrotate.d/clickstream.POST with the ~60MB file, and everything > looked fine....now I try a ~190MB file.... > > > > ls -lrt /var/log/clickstream > /var/log/clickstream: > total 192336 > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log > > > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume log. > > .... > .... > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > Could not find file: > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 > java.io.FileNotFoundException: > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 (Permission > denied) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at java.io.FileReader.<init>(FileReader.java:72) > at > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > at > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > at > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:722) > 20 Nov 2012 16:45:13,174 INFO [hdfs-c1s3-call-runner-5] > (org.apache.flume.sink.hdfs.BucketWriter.doOpen:205) - Creating > s3n://X:Y@my-bucket/clicks/2012/11/clicks-2012-11-20-16-40-10.145.184.200-.1353429913104.gz.tmp > .... > .... > > But it seems that FLume did process the log..... > > ls -lrt /mnt/flume/clickstream/ > total 256704 > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:25 > clickstream.log-2012-11-20-1353428715.COMPLETED > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:45 > clickstream.log-2012-11-20-1353429906.COMPLETED > > > Regards, > > Dano > > > > > > On Tue, Nov 20, 2012 at 9:21 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >> >> Yeah I think that makes sense, I have created a JIRA for this >> >> https://issues.apache.org/jira/browse/FLUME-1733 >> >> Brock >> >> On Tue, Nov 20, 2012 at 9:02 AM, Dan Young <[EMAIL PROTECTED]> wrote: >> > Hey Brock, >> > >> > I can do some more testing on my side with smaller files as well as >> > doing a Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-20, 17:01
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-20, 17:10
This is running on an EC2 instance, and these are the ephemeral drive(s).
Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvda1 8256952 1618248 6219276 21% / udev 1912272 8 1912264 1% /dev tmpfs 768100 180 767920 1% /run none 5120 0 5120 0% /run/lock none 1920244 0 1920244 0% /run/shm /dev/xvdb 412814512 1127172 390717564 1% /mnt ubuntu@ip-foo:~$ cat /etc/fstab LABEL=cloudimg-rootfs / ext4 defaults 0 0 /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 ubuntu@ip-foo:~$ ubuntu@ip-foo:~$ mount /dev/xvda1 on / type ext4 (rw) proc on /proc type proc (rw,noexec,nosuid,nodev) sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) none on /sys/fs/fuse/connections type fusectl (rw) none on /sys/kernel/debug type debugfs (rw) none on /sys/kernel/security type securityfs (rw) udev on /dev type devtmpfs (rw,mode=0755) devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) none on /run/shm type tmpfs (rw,nosuid,nodev) /dev/xvdb on /mnt type ext3 (rw) I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. Dano On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > Are /var/log and /mnt/flume on the same file system? mv across file > systems is a cp and then delete. > > On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > Coolio, thank you Brock. > > > > Did a quick test with a mv vs cp.....this is one test, repeated a few > > times....the smaller file (~60M) seemed to work fine w/ a mv, but a > larger > > file we're seeing the same behavior.... > > > > > > Start with no logs in the SpoolingDirectory: > > > > ls -lrt /mnt/flume/clickstream > > > > /mnt/flume/clickstream: > > total 0 > > > > > > > > Review the logs that will be rotated via logrotate.d into the respective > > directory; /mnt/flume/clickstream > > > > ls -lrt /var/log/clickstream > > /var/log/clickstream: > > total 64112 > > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > > > > > > > > Review logrotate config in /etc/logrotate.d. Note here, I changed from > cp -p > > to a mv..... > > > > /var/log/clickstream/clickstream.log > > { > > missingok > > rotate 3 > > compress > > delaycompress > > copytruncate > > notifempty > > size 50M > > dateext > > dateformat -%Y-%m-%d-%s > > create 666 ubuntu ubuntu > > postrotate > > mv $1 /mnt/flume/clickstream/ 2>&1 > > endscript > > } > > > > > > I run logrotate.d/clickstream.POST with the ~60MB file, and everything > > looked fine....now I try a ~190MB file.... > > > > > > > > ls -lrt /var/log/clickstream > > /var/log/clickstream: > > total 192336 > > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log > > > > > > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume > log. > > > > .... > > .... > > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] > > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - > > Could not find file: > > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 > > java.io.FileNotFoundException: > > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 (Permission > > denied) > > at java.io.FileInputStream.open(Native Method) > > at java.io.FileInputStream.<init>(FileInputStream.java:138) > > at java.io.FileReader.<init>(FileReader.java:72) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) > > at > > > org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:172) > > at > > > org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135) > > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > > at > > > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) +
Dan Young 2012-11-20, 17:10
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-20, 17:14
OK yeah, so basically it's doing a copy and then a delete. See "man 2
rename" then ERRORS -> EXDEV but basically atomically renaming a file only works on the same mount point. On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <[EMAIL PROTECTED]> wrote: > This is running on an EC2 instance, and these are the ephemeral drive(s). > > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/xvda1 8256952 1618248 6219276 21% / > udev 1912272 8 1912264 1% /dev > tmpfs 768100 180 767920 1% /run > none 5120 0 5120 0% /run/lock > none 1920244 0 1920244 0% /run/shm > /dev/xvdb 412814512 1127172 390717564 1% /mnt > > ubuntu@ip-foo:~$ cat /etc/fstab > LABEL=cloudimg-rootfs / ext4 defaults 0 0 > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 > ubuntu@ip-foo:~$ > > ubuntu@ip-foo:~$ mount > /dev/xvda1 on / type ext4 (rw) > proc on /proc type proc (rw,noexec,nosuid,nodev) > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > none on /sys/fs/fuse/connections type fusectl (rw) > none on /sys/kernel/debug type debugfs (rw) > none on /sys/kernel/security type securityfs (rw) > udev on /dev type devtmpfs (rw,mode=0755) > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) > none on /run/shm type tmpfs (rw,nosuid,nodev) > /dev/xvdb on /mnt type ext3 (rw) > > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. > > Dano > > > > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >> >> Are /var/log and /mnt/flume on the same file system? mv across file >> systems is a cp and then delete. >> >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> wrote: >> > Coolio, thank you Brock. >> > >> > Did a quick test with a mv vs cp.....this is one test, repeated a few >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a >> > larger >> > file we're seeing the same behavior.... >> > >> > >> > Start with no logs in the SpoolingDirectory: >> > >> > ls -lrt /mnt/flume/clickstream >> > >> > /mnt/flume/clickstream: >> > total 0 >> > >> > >> > >> > Review the logs that will be rotated via logrotate.d into the respective >> > directory; /mnt/flume/clickstream >> > >> > ls -lrt /var/log/clickstream >> > /var/log/clickstream: >> > total 64112 >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log >> > >> > >> > >> > Review logrotate config in /etc/logrotate.d. Note here, I changed from >> > cp -p >> > to a mv..... >> > >> > /var/log/clickstream/clickstream.log >> > { >> > missingok >> > rotate 3 >> > compress >> > delaycompress >> > copytruncate >> > notifempty >> > size 50M >> > dateext >> > dateformat -%Y-%m-%d-%s >> > create 666 ubuntu ubuntu >> > postrotate >> > mv $1 /mnt/flume/clickstream/ 2>&1 >> > endscript >> > } >> > >> > >> > I run logrotate.d/clickstream.POST with the ~60MB file, and everything >> > looked fine....now I try a ~190MB file.... >> > >> > >> > >> > ls -lrt /var/log/clickstream >> > /var/log/clickstream: >> > total 192336 >> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log >> > >> > >> > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume >> > log. >> > >> > .... >> > .... >> > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] >> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) - >> > Could not find file: >> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 >> > java.io.FileNotFoundException: >> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 (Permission >> > denied) >> > at java.io.FileInputStream.open(Native Method) >> > at java.io.FileInputStream.<init>(FileInputStream.java:138) >> > at java.io.FileReader.<init>(FileReader.java:72) >> > at >> > >> > org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile(SpoolingFileLineReader.java:322) Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-20, 17:14
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-20, 17:17
Ahh...thank you. Let me check some things and make some changes....
On Tue, Nov 20, 2012 at 10:14 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > OK yeah, so basically it's doing a copy and then a delete. See "man 2 > rename" then ERRORS -> EXDEV but basically atomically renaming a file > only works on the same mount point. > > On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > This is running on an EC2 instance, and these are the ephemeral drive(s). > > > > Filesystem 1K-blocks Used Available Use% Mounted on > > /dev/xvda1 8256952 1618248 6219276 21% / > > udev 1912272 8 1912264 1% /dev > > tmpfs 768100 180 767920 1% /run > > none 5120 0 5120 0% /run/lock > > none 1920244 0 1920244 0% /run/shm > > /dev/xvdb 412814512 1127172 390717564 1% /mnt > > > > ubuntu@ip-foo:~$ cat /etc/fstab > > LABEL=cloudimg-rootfs / ext4 defaults 0 0 > > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 > > ubuntu@ip-foo:~$ > > > > ubuntu@ip-foo:~$ mount > > /dev/xvda1 on / type ext4 (rw) > > proc on /proc type proc (rw,noexec,nosuid,nodev) > > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > > none on /sys/fs/fuse/connections type fusectl (rw) > > none on /sys/kernel/debug type debugfs (rw) > > none on /sys/kernel/security type securityfs (rw) > > udev on /dev type devtmpfs (rw,mode=0755) > > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) > > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) > > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) > > none on /run/shm type tmpfs (rw,nosuid,nodev) > > /dev/xvdb on /mnt type ext3 (rw) > > > > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. > > > > Dano > > > > > > > > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> > wrote: > >> > >> Are /var/log and /mnt/flume on the same file system? mv across file > >> systems is a cp and then delete. > >> > >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> > wrote: > >> > Coolio, thank you Brock. > >> > > >> > Did a quick test with a mv vs cp.....this is one test, repeated a few > >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a > >> > larger > >> > file we're seeing the same behavior.... > >> > > >> > > >> > Start with no logs in the SpoolingDirectory: > >> > > >> > ls -lrt /mnt/flume/clickstream > >> > > >> > /mnt/flume/clickstream: > >> > total 0 > >> > > >> > > >> > > >> > Review the logs that will be rotated via logrotate.d into the > respective > >> > directory; /mnt/flume/clickstream > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 64112 > >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > >> > > >> > > >> > > >> > Review logrotate config in /etc/logrotate.d. Note here, I changed from > >> > cp -p > >> > to a mv..... > >> > > >> > /var/log/clickstream/clickstream.log > >> > { > >> > missingok > >> > rotate 3 > >> > compress > >> > delaycompress > >> > copytruncate > >> > notifempty > >> > size 50M > >> > dateext > >> > dateformat -%Y-%m-%d-%s > >> > create 666 ubuntu ubuntu > >> > postrotate > >> > mv $1 /mnt/flume/clickstream/ 2>&1 > >> > endscript > >> > } > >> > > >> > > >> > I run logrotate.d/clickstream.POST with the ~60MB file, and everything > >> > looked fine....now I try a ~190MB file.... > >> > > >> > > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 192336 > >> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log > >> > > >> > > >> > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume > >> > log. > >> > > >> > .... > >> > .... > >> > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] > >> > (org.apache.flume.client.avro.SpoolingFileLineReader.getNextFile:328) > - > >> > Could not find file: > >> > /mnt/flume/clickstream/clickstream.log-2012-11-20-1353429906 +
Dan Young 2012-11-20, 17:17
-
Re: .SpoolingFileLineReader warning....Dan Young 2012-11-20, 20:03
Just a quick followup on this....I moved all the logging, etc to the same
drive/device and I'm no longer seeing those permission errors!. I'm going to keep testing this week. Thank you Brock. Regards, Dano On Tue, Nov 20, 2012 at 10:14 AM, Brock Noland <[EMAIL PROTECTED]> wrote: > OK yeah, so basically it's doing a copy and then a delete. See "man 2 > rename" then ERRORS -> EXDEV but basically atomically renaming a file > only works on the same mount point. > > On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <[EMAIL PROTECTED]> wrote: > > This is running on an EC2 instance, and these are the ephemeral drive(s). > > > > Filesystem 1K-blocks Used Available Use% Mounted on > > /dev/xvda1 8256952 1618248 6219276 21% / > > udev 1912272 8 1912264 1% /dev > > tmpfs 768100 180 767920 1% /run > > none 5120 0 5120 0% /run/lock > > none 1920244 0 1920244 0% /run/shm > > /dev/xvdb 412814512 1127172 390717564 1% /mnt > > > > ubuntu@ip-foo:~$ cat /etc/fstab > > LABEL=cloudimg-rootfs / ext4 defaults 0 0 > > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 > > ubuntu@ip-foo:~$ > > > > ubuntu@ip-foo:~$ mount > > /dev/xvda1 on / type ext4 (rw) > > proc on /proc type proc (rw,noexec,nosuid,nodev) > > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > > none on /sys/fs/fuse/connections type fusectl (rw) > > none on /sys/kernel/debug type debugfs (rw) > > none on /sys/kernel/security type securityfs (rw) > > udev on /dev type devtmpfs (rw,mode=0755) > > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) > > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) > > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) > > none on /run/shm type tmpfs (rw,nosuid,nodev) > > /dev/xvdb on /mnt type ext3 (rw) > > > > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. > > > > Dano > > > > > > > > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> > wrote: > >> > >> Are /var/log and /mnt/flume on the same file system? mv across file > >> systems is a cp and then delete. > >> > >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> > wrote: > >> > Coolio, thank you Brock. > >> > > >> > Did a quick test with a mv vs cp.....this is one test, repeated a few > >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a > >> > larger > >> > file we're seeing the same behavior.... > >> > > >> > > >> > Start with no logs in the SpoolingDirectory: > >> > > >> > ls -lrt /mnt/flume/clickstream > >> > > >> > /mnt/flume/clickstream: > >> > total 0 > >> > > >> > > >> > > >> > Review the logs that will be rotated via logrotate.d into the > respective > >> > directory; /mnt/flume/clickstream > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 64112 > >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > >> > > >> > > >> > > >> > Review logrotate config in /etc/logrotate.d. Note here, I changed from > >> > cp -p > >> > to a mv..... > >> > > >> > /var/log/clickstream/clickstream.log > >> > { > >> > missingok > >> > rotate 3 > >> > compress > >> > delaycompress > >> > copytruncate > >> > notifempty > >> > size 50M > >> > dateext > >> > dateformat -%Y-%m-%d-%s > >> > create 666 ubuntu ubuntu > >> > postrotate > >> > mv $1 /mnt/flume/clickstream/ 2>&1 > >> > endscript > >> > } > >> > > >> > > >> > I run logrotate.d/clickstream.POST with the ~60MB file, and everything > >> > looked fine....now I try a ~190MB file.... > >> > > >> > > >> > > >> > ls -lrt /var/log/clickstream > >> > /var/log/clickstream: > >> > total 192336 > >> > -rw-rw-r-- 1 ubuntu ubuntu 196945008 Nov 20 16:42 clickstream.log > >> > > >> > > >> > Run logrotate.d/clickstream.POST, and we see the WARNING in the FLume > >> > log. > >> > > >> > .... > >> > .... > >> > 20 Nov 2012 16:45:07,117 WARN [pool-13-thread-1] +
Dan Young 2012-11-20, 20:03
-
Re: .SpoolingFileLineReader warning....Brock Noland 2012-11-20, 20:06
Great to hear it's working for you!
On Tue, Nov 20, 2012 at 2:03 PM, Dan Young <[EMAIL PROTECTED]> wrote: > Just a quick followup on this....I moved all the logging, etc to the same > drive/device and I'm no longer seeing those permission errors!. I'm going > to keep testing this week. > > Thank you Brock. > > Regards, > > Dano > > > On Tue, Nov 20, 2012 at 10:14 AM, Brock Noland <[EMAIL PROTECTED]> wrote: >> >> OK yeah, so basically it's doing a copy and then a delete. See "man 2 >> rename" then ERRORS -> EXDEV but basically atomically renaming a file >> only works on the same mount point. >> >> On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <[EMAIL PROTECTED]> wrote: >> > This is running on an EC2 instance, and these are the ephemeral >> > drive(s). >> > >> > Filesystem 1K-blocks Used Available Use% Mounted on >> > /dev/xvda1 8256952 1618248 6219276 21% / >> > udev 1912272 8 1912264 1% /dev >> > tmpfs 768100 180 767920 1% /run >> > none 5120 0 5120 0% /run/lock >> > none 1920244 0 1920244 0% /run/shm >> > /dev/xvdb 412814512 1127172 390717564 1% /mnt >> > >> > ubuntu@ip-foo:~$ cat /etc/fstab >> > LABEL=cloudimg-rootfs / ext4 defaults 0 0 >> > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 >> > ubuntu@ip-foo:~$ >> > >> > ubuntu@ip-foo:~$ mount >> > /dev/xvda1 on / type ext4 (rw) >> > proc on /proc type proc (rw,noexec,nosuid,nodev) >> > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) >> > none on /sys/fs/fuse/connections type fusectl (rw) >> > none on /sys/kernel/debug type debugfs (rw) >> > none on /sys/kernel/security type securityfs (rw) >> > udev on /dev type devtmpfs (rw,mode=0755) >> > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) >> > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) >> > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) >> > none on /run/shm type tmpfs (rw,nosuid,nodev) >> > /dev/xvdb on /mnt type ext3 (rw) >> > >> > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. >> > >> > Dano >> > >> > >> > >> > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> >> > wrote: >> >> >> >> Are /var/log and /mnt/flume on the same file system? mv across file >> >> systems is a cp and then delete. >> >> >> >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> >> >> wrote: >> >> > Coolio, thank you Brock. >> >> > >> >> > Did a quick test with a mv vs cp.....this is one test, repeated a few >> >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a >> >> > larger >> >> > file we're seeing the same behavior.... >> >> > >> >> > >> >> > Start with no logs in the SpoolingDirectory: >> >> > >> >> > ls -lrt /mnt/flume/clickstream >> >> > >> >> > /mnt/flume/clickstream: >> >> > total 0 >> >> > >> >> > >> >> > >> >> > Review the logs that will be rotated via logrotate.d into the >> >> > respective >> >> > directory; /mnt/flume/clickstream >> >> > >> >> > ls -lrt /var/log/clickstream >> >> > /var/log/clickstream: >> >> > total 64112 >> >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log >> >> > >> >> > >> >> > >> >> > Review logrotate config in /etc/logrotate.d. Note here, I changed >> >> > from >> >> > cp -p >> >> > to a mv..... >> >> > >> >> > /var/log/clickstream/clickstream.log >> >> > { >> >> > missingok >> >> > rotate 3 >> >> > compress >> >> > delaycompress >> >> > copytruncate >> >> > notifempty >> >> > size 50M >> >> > dateext >> >> > dateformat -%Y-%m-%d-%s >> >> > create 666 ubuntu ubuntu >> >> > postrotate >> >> > mv $1 /mnt/flume/clickstream/ 2>&1 >> >> > endscript >> >> > } >> >> > >> >> > >> >> > I run logrotate.d/clickstream.POST with the ~60MB file, and >> >> > everything >> >> > looked fine....now I try a ~190MB file.... >> >> > >> >> > >> >> > >> >> > ls -lrt /var/log/clickstream >> >> > /var/log/clickstream: >> > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/ +
Brock Noland 2012-11-20, 20:06
-
Re: .SpoolingFileLineReader warning....Patrick Wendell 2012-11-23, 12:46
Ah got it... great! Glad this got figured out.
On Tue, Nov 20, 2012 at 3:06 PM, Brock Noland <[EMAIL PROTECTED]> wrote: > Great to hear it's working for you! > > On Tue, Nov 20, 2012 at 2:03 PM, Dan Young <[EMAIL PROTECTED]> wrote: > > Just a quick followup on this....I moved all the logging, etc to the same > > drive/device and I'm no longer seeing those permission errors!. I'm > going > > to keep testing this week. > > > > Thank you Brock. > > > > Regards, > > > > Dano > > > > > > On Tue, Nov 20, 2012 at 10:14 AM, Brock Noland <[EMAIL PROTECTED]> > wrote: > >> > >> OK yeah, so basically it's doing a copy and then a delete. See "man 2 > >> rename" then ERRORS -> EXDEV but basically atomically renaming a file > >> only works on the same mount point. > >> > >> On Tue, Nov 20, 2012 at 11:10 AM, Dan Young <[EMAIL PROTECTED]> > wrote: > >> > This is running on an EC2 instance, and these are the ephemeral > >> > drive(s). > >> > > >> > Filesystem 1K-blocks Used Available Use% Mounted on > >> > /dev/xvda1 8256952 1618248 6219276 21% / > >> > udev 1912272 8 1912264 1% /dev > >> > tmpfs 768100 180 767920 1% /run > >> > none 5120 0 5120 0% /run/lock > >> > none 1920244 0 1920244 0% /run/shm > >> > /dev/xvdb 412814512 1127172 390717564 1% /mnt > >> > > >> > ubuntu@ip-foo:~$ cat /etc/fstab > >> > LABEL=cloudimg-rootfs / ext4 defaults 0 0 > >> > /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 > >> > ubuntu@ip-foo:~$ > >> > > >> > ubuntu@ip-foo:~$ mount > >> > /dev/xvda1 on / type ext4 (rw) > >> > proc on /proc type proc (rw,noexec,nosuid,nodev) > >> > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > >> > none on /sys/fs/fuse/connections type fusectl (rw) > >> > none on /sys/kernel/debug type debugfs (rw) > >> > none on /sys/kernel/security type securityfs (rw) > >> > udev on /dev type devtmpfs (rw,mode=0755) > >> > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) > >> > tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) > >> > none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) > >> > none on /run/shm type tmpfs (rw,nosuid,nodev) > >> > /dev/xvdb on /mnt type ext3 (rw) > >> > > >> > I do see that / is ext4 and /mnt is ext3.....hmm, not sure it matters. > >> > > >> > Dano > >> > > >> > > >> > > >> > On Tue, Nov 20, 2012 at 10:01 AM, Brock Noland <[EMAIL PROTECTED]> > >> > wrote: > >> >> > >> >> Are /var/log and /mnt/flume on the same file system? mv across file > >> >> systems is a cp and then delete. > >> >> > >> >> On Tue, Nov 20, 2012 at 10:59 AM, Dan Young <[EMAIL PROTECTED]> > >> >> wrote: > >> >> > Coolio, thank you Brock. > >> >> > > >> >> > Did a quick test with a mv vs cp.....this is one test, repeated a > few > >> >> > times....the smaller file (~60M) seemed to work fine w/ a mv, but a > >> >> > larger > >> >> > file we're seeing the same behavior.... > >> >> > > >> >> > > >> >> > Start with no logs in the SpoolingDirectory: > >> >> > > >> >> > ls -lrt /mnt/flume/clickstream > >> >> > > >> >> > /mnt/flume/clickstream: > >> >> > total 0 > >> >> > > >> >> > > >> >> > > >> >> > Review the logs that will be rotated via logrotate.d into the > >> >> > respective > >> >> > directory; /mnt/flume/clickstream > >> >> > > >> >> > ls -lrt /var/log/clickstream > >> >> > /var/log/clickstream: > >> >> > total 64112 > >> >> > -rw-rw-r-- 1 ubuntu ubuntu 65648336 Nov 20 16:05 clickstream.log > >> >> > > >> >> > > >> >> > > >> >> > Review logrotate config in /etc/logrotate.d. Note here, I changed > >> >> > from > >> >> > cp -p > >> >> > to a mv..... > >> >> > > >> >> > /var/log/clickstream/clickstream.log > >> >> > { > >> >> > missingok > >> >> > rotate 3 > >> >> > compress > >> >> > delaycompress > >> >> > copytruncate > >> >> > notifempty > >> >> > size 50M > >> >> > dateext > >> >> > dateformat -%Y-%m-%d-%s > >> >> > create 666 ubuntu ubuntu +
Patrick Wendell 2012-11-23, 12:46
|