Flume >> mail # user >> how to implement a tail or tailDir of flume-ng on windows?


周梦想 2013-02-21, 03:37
dan young 2013-02-21, 05:01
周梦想 2013-02-21, 06:22
Juhani Connolly 2013-02-21, 05:42
Re: how to implement a tail or tailDir of flume-ng on windows?
Hi Juhani,
I wrote a Python script, tail.py, as below:
import time, os
import sys
#Set the filename and open the file
#filename = 'security_log'

def tail_f(file):
  interval = 1.0

  while True:
    where = file.tell()
    line = file.readline()
    if not line:
      time.sleep(interval)
      file.seek(where)
    else:
      yield line
for line in tail_f(open(sys.argv[1])):
  print line,

tail.bat:
C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log

I changed the conf file to:
agent1.sources.userlogsrc.type = exec
agent1.sources.userlogsrc.command = "D:\\apache-flume-1.3.1-bin\\bin\\tail.bat"

This node tails the file; its sink is Avro, which sends to an Avro source on
another node. When I run my flume.bat it reports no errors, and I can see the
connection is OK, but no data is sent to flume-ng.
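
One thing that may be worth ruling out here: when Python's stdout is a pipe rather than a console, output is block-buffered, so lines written with a plain print can sit in the buffer instead of reaching the exec source. Below is a minimal sketch of the same tail loop that flushes after every line. The names `follow` and `emit` and the `max_idle_polls` parameter are hypothetical, introduced only for illustration, and whether buffering is the actual cause in this setup is an assumption.

```python
import sys
import time


def follow(handle, interval=1.0, max_idle_polls=None):
    """Yield lines appended to an open file, polling every `interval` seconds.

    max_idle_polls is a hypothetical knob for testing: stop after that many
    consecutive empty polls. None means follow forever.
    """
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        where = handle.tell()
        line = handle.readline()
        if line:
            idle = 0
            yield line
        else:
            idle += 1
            time.sleep(interval)
            handle.seek(where)  # retry from the same position on the next poll


def emit(lines, out=sys.stdout):
    """Write each line and flush at once, so a reader on a pipe sees it."""
    for line in lines:
        out.write(line)
        out.flush()


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as log_file:
        emit(follow(log_file))
```

Running the interpreter with the `-u` switch is another way to get unbuffered output without changing the script.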

I then changed the config file to:
agent1.sources.userlogsrc.command = "C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py d:\\data.log"

Running flume.bat then reports this error:
2013-02-21 15:21:08,622 (pool-4-thread-1) [ERROR - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:284)] Failed while running command: "C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log"
java.io.IOException: Cannot run program ""C:\Python27\python.exe": CreateProcess error=2, ?????????
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:259)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: CreateProcess error=2, ?????????
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
        at java.lang.ProcessImpl.start(ProcessImpl.java:30)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 7 more
2013-02-21 15:21:08,651 (pool-4-thread-1) [INFO - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:307)] Command ["C:\Python27\python.exe D:\apache-flume-1.3.1-bin\tail.py d:\data.log"] exited with -1073741824

I don't know why the exec source can't run the Python program.
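
The program name in the error, `"C:\Python27\python.exe` with a leading quote, suggests the command string was split on whitespace with the quotes treated as ordinary characters. The sketch below mimics that splitting in Python to show what the first token becomes with and without the surrounding quotes. That the exec source in this version splits the command this way is an assumption drawn from the log, and `split_like_exec_source` is a hypothetical helper, not part of Flume.

```python
import re


def split_like_exec_source(command):
    # Assumption: split on runs of whitespace, with no special handling of quotes.
    return re.split(r"\s+", command.strip())


# The command value as it reads after the properties file unescapes the backslashes.
quoted = '"C:\\Python27\\python.exe D:\\apache-flume-1.3.1-bin\\tail.py d:\\data.log"'
unquoted = quoted.strip('"')

# With quotes, the "program" starts with a literal double quote, so no such file exists.
print(split_like_exec_source(quoted)[0])

# Without quotes, the first token is the interpreter path itself.
print(split_like_exec_source(unquoted)[0])
```

If that reading is right, writing the command value without the surrounding double quotes would be the thing to try.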

Thanks,
Andy
2013/2/21 Juhani Connolly <[EMAIL PROTECTED]>

> You'd want to just periodically stat the file to be tailed, checking for
> change in last modified/size, and read the difference out of it. You could
> always download the source for tail itself and see how it does it:
> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tail.c
>
> If you're going to write this to feed data to flume you're better off
> having it send data over thrift to flume so you can resend it on failures.
>
>
> On 02/21/2013 12:37 PM, 周梦想 wrote:
>
>> Hello,
>>
>> There isn't a tail or tailDir source in flume-ng.
>> The exec source can run the tail command on Linux,
>> but there is no tail command on Windows, so I have to write some code
>> to do the same work.
>> I want to read a file and, whenever new lines are added to it, send those
>> lines to flume-ng.
>>
>> Could someone give me some advice?
>>
>> Thanks,
>> Andy
>>
>>
>
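
The approach Juhani describes above (periodically stat the file and read out the difference) might look like the following sketch. It compares only the file size, not the last-modified time, and treats a shrinking file as truncated. `read_new_bytes` and `tail_by_stat` are hypothetical names, and the thrift resend-on-failure path he recommends is not covered here.

```python
import os
import sys
import time


def read_new_bytes(path, offset):
    """Return (data, new_offset) for bytes appended to `path` since `offset`.

    If the file is now shorter than `offset` (truncated or replaced),
    start again from the beginning.
    """
    size = os.stat(path).st_size
    if size < offset:
        offset = 0
    if size == offset:
        return b"", offset
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read(size - offset)
    return data, offset + len(data)


def tail_by_stat(path, interval=1.0, out=None):
    """Poll `path` forever, copying newly appended bytes to `out`."""
    if out is None:
        # Python 3 exposes a binary stream as sys.stdout.buffer; Python 2 does not.
        out = getattr(sys.stdout, "buffer", sys.stdout)
    offset = os.stat(path).st_size  # start at the current end, like tail -f
    while True:
        data, offset = read_new_bytes(path, offset)
        if data:
            out.write(data)
            out.flush()
        else:
            time.sleep(interval)
```

Keeping the offset outside `read_new_bytes` means the file is reopened on each poll, so the script never holds it open while another process is writing to it.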
周梦想 2013-02-21, 08:22