Flume >> mail # user >> Flume Ng replaying events when the source is idle


Re: Flume Ng replaying events when the source is idle
Sagar,
Try running "tail -F" on the same file several times from the command line.
Each run starts by printing the last few lines of the file (10 by default).

Every time you reload your configuration file, the source re-executes the
specified command, so those lines are emitted again. To avoid this, use
"tail -F -n 0 filename" and you should not see the replay.
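
The difference is easy to check outside Flume. The following is a minimal sketch that assumes a throwaway file created with mktemp and filled by seq (both hypothetical stand-ins for the real log). It leaves out -F so the commands return immediately; -F only adds following of new data and does not change what tail prints at startup.

```shell
# Throwaway file standing in for the tailed log (hypothetical content).
log=$(mktemp)
seq 1 25 > "$log"

# Default: a fresh tail starts by printing the last 10 lines of the file.
default_lines=$(tail "$log" | wc -l)

# With -n 0 it starts at the end of the file and prints nothing old.
n0_lines=$(tail -n 0 "$log" | wc -l)

echo "default: $default_lines line(s), -n 0: $n0_lines line(s)"
rm -f "$log"
```

Each restart of the command behaves like the first case unless -n 0 is given, which would match the repeated batches of identical events.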

Regards,
Mike

On Mon, Mar 4, 2013 at 4:13 PM, Hari Shreedharan
<[EMAIL PROTECTED]> wrote:

> Flume will reload the configuration file every time it is modified. Since
> Puppet rewrites it, Flume reloads it. The events are probably replayed
> because the transactions were incomplete or something like that. File
> Channel will not replay events that have been completely persisted
> to HDFS with the transaction closed. If Puppet does not rewrite the config
> file, do you see this issue?
>
> --
> Hari Shreedharan
>
> On Monday, March 4, 2013 at 3:06 PM, Sagar Mehta wrote:
>
> I think we found the issue. Not sure if this is the root cause, but it
> looks highly correlated.
>
> We manage configs using Puppet, which currently runs from cron with the
> following configuration:
>
> ## puppetrun Cron Job
> 20,50 * * * *  root sleep $((RANDOM\%60)) > /dev/null 2>&1; puppet agent
> --onetime --no-daemonize --logdest syslog > /dev/null 2>&1
>
> *Note the times at which Puppet runs alongside the timestamps in the
> listing below.*
>
> Also, after combing through the Flume logs, we noticed Flume is reloading
> the configuration after every Puppet run:
>
> sagar@drspock ~/temp $ cat flume.log.2013-03-03 | egrep -i "reloading" |
> head -5
> 2013-03-03 00:20:44,174 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 00:51:14,374 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 01:21:15,072 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 01:51:15,778 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 02:20:46,481 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
>
> In our current setup, the Flume config file, /opt/flume/conf/hdfs.conf,
> is rewritten after every Puppet run due to variable interpolation in the
> template.
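
One general way to avoid touching a file whose content has not changed is to render to a temporary file and replace the target only when the two differ. This is a minimal sketch, not the setup used in this thread: the function name install_if_changed, the file names, and the sample config line are all hypothetical, and it assumes the reload is triggered by the file being modified, as described elsewhere in the thread.

```shell
# install_if_changed NEW TARGET: replace TARGET only when contents differ,
# so TARGET is left untouched (same modification time) on no-op runs.
install_if_changed() {
    new=$1
    target=$2
    if [ -f "$target" ] && cmp -s "$new" "$target"; then
        rm -f "$new"        # identical: discard the fresh render
        return 1
    fi
    mv "$new" "$target"     # different (or first run): install it
    return 0
}

# Demo with throwaway files standing in for the rendered template and hdfs.conf.
workdir=$(mktemp -d)
target="$workdir/hdfs.conf"
printf 'agent.sources = tail1\n' > "$workdir/render1"   # hypothetical content
printf 'agent.sources = tail1\n' > "$workdir/render2"   # same content, re-rendered

install_if_changed "$workdir/render1" "$target" && first=installed || first=skipped
install_if_changed "$workdir/render2" "$target" && second=installed || second=skipped
echo "first run: $first, second run: $second"
```

The second call leaves the target alone because the re-rendered content is byte-for-byte identical, so a process watching the file for modifications would have nothing to react to.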
>
> *We are still not sure what is causing Flume to reload the config file,
> and even if the file is reloaded, why are the same events getting replayed?
> [The state should be saved somewhere on disk; that's what the file channel
> is for, I thought.]*
>
> Any pointers/insights appreciated.
>
> Sagar
>
>
> On Mon, Mar 4, 2013 at 2:42 PM, Sagar Mehta <[EMAIL PROTECTED]> wrote:
>
> Guys,
>
> Yes, this issue was also seen with the memory channel. In fact, when we
> moved to the file-based channel, we initially thought this issue wouldn't
> occur since it stores checkpoints.
>
> Anyway, below are all the files for collector110 [whose source didn't
> receive any events], and you can see all the replays. I have attached the
> corresponding Flume log file for the same day.
>
> hadoop@jobtracker301:/home/smehta$ hls
> /ngpipes-raw-logs/2013-03-03/*/collector110* |  head -5
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:20
> /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362270044367.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:51
> /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362271875065.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:21
> /ngpipes-raw-logs/2013-03-03/0100/collector110.ngpipes.sac.ngmoco.com.1362273675770.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:51