Re: Flume Ng replaying events when the source is idle
Guys,

Yes, this issue was also seen with the memory channel. In fact, when we moved
to the file-based channel, we initially thought this issue wouldn't occur since
it stores checkpoints.

Anyway, below are the first five files for collector110 [whose source didn't
receive any events], and you can see the replays in the listing. I have
attached the corresponding Flume log file for the same day.

hadoop@jobtracker301:/home/smehta$ hls /ngpipes-raw-logs/2013-03-03/*/collector110* | head -5
-rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:20 /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362270044367.gz
-rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:51 /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362271875065.gz
-rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:21 /ngpipes-raw-logs/2013-03-03/0100/collector110.ngpipes.sac.ngmoco.com.1362273675770.gz
-rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:51 /ngpipes-raw-logs/2013-03-03/0100/collector110.ngpipes.sac.ngmoco.com.1362275476474.gz
-rw-r--r--   3 hadoop supergroup       1594 2013-03-03 02:20 /ngpipes-raw-logs/2013-03-03/0200/collector110.ngpipes.sac.ngmoco.com.1362277246704.gz
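
For anyone following along, here is a rough Python sketch that decodes the numeric suffix of the file names in the listing above. It assumes the 13-digit number is an epoch timestamp in milliseconds, which is my reading of the names and not something confirmed in this thread; `creation_time` is just an illustrative helper.

```python
import re
from datetime import datetime, timedelta, timezone

# File names copied from the listing above.
FILES = [
    "collector110.ngpipes.sac.ngmoco.com.1362270044367.gz",
    "collector110.ngpipes.sac.ngmoco.com.1362271875065.gz",
    "collector110.ngpipes.sac.ngmoco.com.1362273675770.gz",
    "collector110.ngpipes.sac.ngmoco.com.1362275476474.gz",
    "collector110.ngpipes.sac.ngmoco.com.1362277246704.gz",
]


def creation_time(name):
    """Decode the 13-digit suffix, ASSUMED to be epoch milliseconds, as UTC."""
    match = re.search(r"\.(\d{13})\.gz$", name)
    if match is None:
        raise ValueError(f"no 13-digit timestamp suffix in {name!r}")
    millis = int(match.group(1))
    # Integer arithmetic avoids float rounding on the millisecond part.
    seconds = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return seconds + timedelta(milliseconds=millis % 1000)


times = [creation_time(name) for name in FILES]
gaps = [later - earlier for earlier, later in zip(times, times[1:])]

for name, when in zip(FILES, times):
    print(when.strftime("%Y-%m-%d %H:%M:%S"), name)
for gap in gaps:
    print("gap to next file:", gap)
```

Under that assumption the files are spaced roughly 30 minutes apart, and the decoded times agree with the modification times in the listing if the cluster clock is on UTC.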

Also, in the attached Flume log you can see the replays I'm talking about.
Please note that the source received no events during this time.

sagar@drspock ~/temp $ cat flume.log.2013-03-03 | egrep -i "Queue Size after replay" | head
2013-03-03 00:20:44,355 [lifecycleSupervisor-1-3] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 00:20:44,356 [lifecycleSupervisor-1-4] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel2]
2013-03-03 00:51:14,571 [lifecycleSupervisor-1-7] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 00:51:14,577 [lifecycleSupervisor-1-1] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 01:21:15,276 [lifecycleSupervisor-1-8] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 01:21:15,281 [lifecycleSupervisor-1-7] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 01:51:15,979 [lifecycleSupervisor-1-9] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 01:51:15,985 [lifecycleSupervisor-1-5] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 02:20:46,697 [lifecycleSupervisor-1-2] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 02:20:46,697 [lifecycleSupervisor-1-8] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel2]
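
To make the pattern in the grep output above easier to see, here is a small Python sketch that groups the "Queue Size after replay" lines by channel and prints the time between consecutive replays. The sample lines are the ones pasted above, written one per line; the regular expression and the `replays_by_channel` helper are illustrative and only assume the log layout shown in this message.

```python
import re
from collections import defaultdict
from datetime import datetime

LOG = """\
2013-03-03 00:20:44,355 [lifecycleSupervisor-1-3] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 00:20:44,356 [lifecycleSupervisor-1-4] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel2]
2013-03-03 00:51:14,571 [lifecycleSupervisor-1-7] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 00:51:14,577 [lifecycleSupervisor-1-1] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 01:21:15,276 [lifecycleSupervisor-1-8] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 01:21:15,281 [lifecycleSupervisor-1-7] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 01:51:15,979 [lifecycleSupervisor-1-9] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 0 [channel=channel2]
2013-03-03 01:51:15,985 [lifecycleSupervisor-1-5] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 02:20:46,697 [lifecycleSupervisor-1-2] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel1]
2013-03-03 02:20:46,697 [lifecycleSupervisor-1-8] INFO org.apache.flume.channel.file.FileChannel - Queue Size after replay: 10 [channel=channel2]
"""

# Timestamp at line start, then the replay message with queue size and channel.
REPLAY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r".*Queue Size after replay: (\d+)\s+\[channel=(\w+)\]",
    re.MULTILINE,
)


def replays_by_channel(text):
    """Map channel name -> list of (timestamp, queue size) in log order."""
    result = defaultdict(list)
    for stamp, size, channel in REPLAY.findall(text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        result[channel].append((when, int(size)))
    return dict(result)


replays = replays_by_channel(LOG)
for channel, entries in sorted(replays.items()):
    sizes = [size for _, size in entries]
    gaps = [b[0] - a[0] for a, b in zip(entries, entries[1:])]
    print(channel, "sizes:", sizes)
    print(channel, "gaps:", [str(gap) for gap in gaps])
```

In this excerpt both channels log a replay about every 30 minutes. The sketch only surfaces that interval; it does not say what triggers the replay.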

As for the contents of the files, yes, they are exactly the same 10 lines of
events replayed over and over. I checked that.
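
One way to repeat that check mechanically is sketched below in Python. It assumes the .gz files have already been copied from HDFS to a local directory, and `group_by_content` is a hypothetical helper, not part of Flume or Hadoop. It hashes the decompressed payload rather than the .gz bytes, since compressed files can differ in their headers even when the payload is the same.

```python
import gzip
import hashlib
from collections import defaultdict


def group_by_content(paths):
    """Group local .gz files whose decompressed contents are byte-identical."""
    groups = defaultdict(list)
    for path in paths:
        with gzip.open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        groups[digest].append(path)
    # Each inner list holds files that share exactly the same payload.
    return list(groups.values())
```

If every file for the idle period lands in a single group, the same events were written each time.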

Let me know if you guys have any insights into this or if this is a bug in
Flume Ng.

Sagar
On Thu, Feb 28, 2013 at 2:59 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> Can you also send the Flume agent logs? Did you check the contents of the
> files?
>
> --
> Hari Shreedharan
>
> On Thursday, February 28, 2013 at 2:43 PM, Roshan Naik wrote:
>
> Would you be able to verify whether the same problem can be reproduced by
> using the memory channel instead in a test setup?
>
>
> On Wed, Feb 27, 2013 at 11:37 AM, Sagar Mehta <[EMAIL PROTECTED]> wrote:
>
> Hi Guys,
>
> I'm using Flume Ng and it is working pretty well except for a weird
> situation which I observed lately. In essence, I'm using an exec source to
> run tail -F on a log file, and two HDFS sinks with a file channel.
>
> However, I have observed that when the source [log file of a Jetty-based
> collector] is idle, that is, no new events are pushed to the log file,
> Flume Ng seems to replay the same set of events.