Flume >> mail # user >> flume tail source problem and performance


周梦想 2013-01-29, 07:24
Re: flume tail source problem and performance
Hi,

You could use tail -F, but that relies on an external process that Flume has no control over. Alternatively, you can write your own script and plug it in instead.
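
A minimal sketch of what such a script could look like, assuming the goal is to resume from where the previous run stopped rather than resend the whole file. The function name read_new_lines and the offset file are hypothetical, not part of Flume, and how the output is fed into an agent is left open here:

```python
import os


def read_new_lines(log_path, offset_path):
    """Return complete lines appended to log_path since the last call.

    The byte offset reached so far is persisted in offset_path
    (a hypothetical side file), so a restart resumes where it stopped.
    """
    # Load the previously saved byte offset; start from 0 when none exists.
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0

    # If the log is now shorter than the offset (rotated or truncated),
    # start over from the beginning of the file.
    if offset > os.path.getsize(log_path):
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Consume only complete lines; a trailing partial line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].split(b"\n")[:-1]

    # Replace the offset file atomically so a crash cannot leave it half-written.
    tmp_path = offset_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(str(offset + end))
    os.replace(tmp_path, offset_path)

    return [line.decode("utf-8", errors="replace") for line in lines]
```

Note that this sketch saves the offset before handing the lines back, so lines could be lost if the caller crashes before forwarding them. Saving the offset only after a successful send would trade that for possible duplicates.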

What are the contents of the
/tmp/flume/agent/agent*.*/ directories? Are sent and sending clean?
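
One quick way to check is to count the files in each subdirectory. This is only a sketch: it assumes "clean" means the subdirectories are empty, and count_files is a hypothetical helper, not a Flume tool:

```python
import glob
import os


def count_files(pattern="/tmp/flume/agent/agent*.*/"):
    """Map each subdirectory of the matched agent directories to its entry count."""
    counts = {}
    # The trailing slash in the pattern makes glob match directories only.
    for agent_dir in sorted(glob.glob(pattern)):
        for name in sorted(os.listdir(agent_dir)):
            sub = os.path.join(agent_dir, name)
            if os.path.isdir(sub):
                counts[sub] = len(os.listdir(sub))
    return counts


if __name__ == "__main__":
    # Print one line per subdirectory, e.g. the sent and sending directories.
    for path, n in count_files().items():
        print(f"{n:6d}  {path}")
```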

- Alex

On Jan 29, 2013, at 8:24 AM, 周梦想 <[EMAIL PROTECTED]> wrote:

> Hello,
> 1. I want to tail a log file and write it to HDFS. My configuration is below:
> config [ag1, tail("/home/zhouhh/game.log",startFromEnd=true),
> agentDFOSink("hadoop48",35853) ;]
> config [ag2, tail("/home/zhouhh/game.log",startFromEnd=true),
> agentDFOSink("hadoop48",35853) ;]
> config [co1, collectorSource( 35853 ),  [collectorSink(
> "hdfs://hadoop48:54310/user/flume/%y%m/%d","%{host}-",5000,raw),collectorSink(
> "hdfs://hadoop48:54310/user/flume/%y%m","%{host}-",10000,raw)]]
>
>
> I found that if I restart the agent node, it resends the contents of game.log
> to the collector. Is there a way to resume sending from the point where the
> previous run stopped? Or do I have to keep a marker myself, or remove the
> logs manually, whenever I restart the agent node?
>
> 2. I tested the performance of Flume and found it a bit slow.
> With the configuration above, the throughput is only 50MB/minute.
> I changed the configuration to the following:
> ag1:tail("/home/zhouhh/game.log",startFromEnd=true)|batch(1000) gzip
> agentDFOSink("hadoop48",35853);
>
> config [co1, collectorSource( 35853 ), [collectorSink(
> "hdfs://hadoop48:54310/user/flume/%y%m/%d","%{host}-",5000,raw),collectorSink(
> "hdfs://hadoop48:54310/user/flume/%y%m","%{host}-",10000,raw)]]
>
> I sent a 300MB log and it took about 3 minutes, so that is about 100MB/minute.
>
> By comparison, when I send the log from ag1 to co1 via scp, I get about 30MB/second.
>
> Does anyone have any ideas?
>
> thanks!
>
> Andy

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF