Flume >> mail # user >> how to implement a tail or tailDir of flume-ng on windows?


周梦想 2013-02-21, 03:37
dan young 2013-02-21, 05:01
周梦想 2013-02-21, 06:22
Juhani Connolly 2013-02-21, 05:42
周梦想 2013-02-21, 07:23
Re: how to implement a tail or tailDir of flume-ng on windows?
Sorry, it seems to work after I changed the config file to:
agent1.sources.userlogsrc.command = C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py E:\\mydoc\\gamelog\\game.log

I removed the quotation marks (" ") around the command, and the python process is now created OK.
So the bat file can also be run as the command.
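
A small illustration of why the quotes break things. This is only a sketch, assuming the exec source splits the command string on whitespace and does not interpret quotation marks (the error message quoted below is consistent with that, but I have not checked the Flume source). The helper name split_on_whitespace is made up for this example.

```python
def split_on_whitespace(command):
    # Assumption: mimics a plain whitespace split with no quote handling.
    return command.split()

# The command values as they look after the properties file turns \\ into \
quoted = '"C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py d:\\data.log"'
unquoted = 'C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py d:\\data.log'

quoted_argv = split_on_whitespace(quoted)
unquoted_argv = split_on_whitespace(unquoted)

# With quotes, the program name keeps its leading quotation mark,
# so Windows looks for a file literally named "C:\Python27\python.exe
print(quoted_argv[0])
# Without quotes, the program name is a real path.
print(unquoted_argv[0])
```

Under that assumption the first token of the quoted form starts with a quotation mark, which matches the `Cannot run program ""C:\Python27\python.exe"` text in the stack trace.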

As for the earlier problem of no file being generated on HDFS, maybe it was
because data.log is too small? It has only a few lines, while game.log is about
400MB.
Or could it be related to my setting agent1.channels.memch1.capacity = 10000?
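
One possible explanation, not confirmed anywhere in this thread: when Python's stdout is a pipe rather than a console, output is block-buffered, so a few short lines can sit in the buffer and never reach the reader, while a 400MB file fills the buffer over and over. Below is a minimal sketch of a tail loop that flushes after every line. The names follow and emit, and the max_idle_polls parameter (only there so the loop can stop in a test), are hypothetical.

```python
import sys
import time

def follow(f, interval=1.0, max_idle_polls=None):
    """Yield lines appended to the open file f.

    max_idle_polls is a hypothetical knob for testing: stop after that many
    consecutive empty polls. None means follow forever.
    """
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        where = f.tell()
        line = f.readline()
        if line:
            idle = 0
            yield line
        else:
            # Nothing new yet: wait, then retry from the same offset.
            idle += 1
            time.sleep(interval)
            f.seek(where)

def emit(lines, out=sys.stdout):
    """Write each line to out and flush immediately."""
    for line in lines:
        out.write(line)
        out.flush()  # push the line to the pipe right away
```

Usage would be something like emit(follow(open(sys.argv[1]))). Running the interpreter with the -u option is another way to get unbuffered output.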

I'll test more of this.
Thanks all.
Andy

2013/2/21 周梦想 <[EMAIL PROTECTED]>

> Hi Juhani,
> I wrote a Python script, tail.py, as below:
> import sys
> import time
>
> def tail_f(file):
>   # Poll the open file and yield each new line as it appears.
>   interval = 1.0
>
>   while True:
>     where = file.tell()
>     line = file.readline()
>     if not line:
>       # Nothing new yet: wait, then retry from the same offset.
>       time.sleep(interval)
>       file.seek(where)
>     else:
>       yield line
>
> # The file to follow is given as the first command-line argument.
> for line in tail_f(open(sys.argv[1])):
>   print line,
>
> tail.bat:
> C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log
>
> I changed the conf file to:
> agent1.sources.userlogsrc.type = exec
> agent1.sources.userlogsrc.command = "D:\\apache-flume-1.3.1-bin\\bin\\tail.bat"
>
> This node tails the file; its sink is avro, sending to another node whose
> source is avro. When I run my flume.bat it reports no errors and I can see
> the connection is OK, but it does not send any data to flume-ng.
>
> If I change the config file to:
> agent1.sources.userlogsrc.command = "C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py d:\\data.log"
>
> and run flume.bat, it reports this error:
> 2013-02-21 15:21:08,622 (pool-4-thread-1) [ERROR - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:284)] Failed while running command: "C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log"
> java.io.IOException: Cannot run program ""C:\Python27\python.exe": CreateProcess error=2, ?????????
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>         at org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:259)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: CreateProcess error=2, ?????????
>         at java.lang.ProcessImpl.create(Native Method)
>         at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:30)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>         ... 7 more
> 2013-02-21 15:21:08,651 (pool-4-thread-1) [INFO - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:307)] Command ["C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log"] exited with -1073741824
>
> I don't understand why the exec source can't run the Python program.
>
> Thanks,
> Andy
>
> 2013/2/21 Juhani Connolly <[EMAIL PROTECTED]>
>
>> You'd want to just periodically stat the file to be tailed, checking for
>> change in last modified/size, and read the difference out of it. You could
>> always download the source for tail itself and see how it does it:
>> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tail.c
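
The stat-and-read-the-difference approach described above could look roughly like this. It is only a sketch: the function name read_new_data is made up, it checks the file size only (st_mtime from the same os.stat call could be compared as well), and it restarts from the beginning if the file shrinks, on the assumption that this means truncation or rotation.

```python
import os

def read_new_data(path, offset):
    """Return (data, new_offset) for bytes appended to path since offset.

    If the file is now smaller than offset, assume it was truncated or
    rotated and start again from 0.
    """
    size = os.stat(path).st_size
    if size < offset:
        offset = 0
    if size == offset:
        return b"", offset
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(size - offset)
    return data, offset + len(data)
```

A caller would keep the returned offset and call read_new_data again after a sleep, forwarding any non-empty data.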
>>
>> If you're going to write this to feed data to flume you're better off
>> having it send data over thrift to flume so you can resend it on failures.
>>
>>
>> On 02/21/2013 12:37 PM, 周梦想 wrote:
>>
>>> Hello,
>>>
>>> There is no tail or tailDir source in flume-ng.
>>> The exec source can run the tail command on Linux.