Flume >> mail # user >> Flume Ng replaying events when the source is idle


Sagar Mehta 2013-02-27, 19:37
Roshan Naik 2013-02-28, 22:43
Hari Shreedharan 2013-02-28, 22:59
Sagar Mehta 2013-03-04, 22:42
Sagar Mehta 2013-03-04, 23:06
Hari Shreedharan 2013-03-05, 00:13

Re: Flume Ng replaying events when the source is idle
Sagar,
Try running "tail -F" on the same file several times on the command line.
Each run starts by printing the last few lines of the file (10 by default).

Every time your configuration file is reloaded, the source re-executes the
specified command, so those lines are sent again. To avoid this, use
"tail -F -n 0 filename", which starts at the end of the file and prints no
existing lines.
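
A minimal sketch of the behaviour described above, using a throwaway sample file (the file and its contents are hypothetical). The -F flag is left out here only so that the commands exit; it adds following and does not change which existing lines tail prints at start-up.

```shell
# Build a hypothetical 25-line log file for illustration.
log=$(mktemp)
seq 1 25 | sed 's/^/event /' > "$log"

# A fresh tail prints the last 10 lines by default, so every restart
# of the command re-emits lines that were already delivered.
replayed=$(tail "$log" | wc -l | tr -d ' ')

# With -n 0, tail starts at the end of the file and prints nothing old.
skipped=$(tail -n 0 "$log" | wc -l | tr -d ' ')

echo "default start re-emits: $replayed lines"   # 10
echo "with -n 0 re-emits:     $skipped lines"    # 0
rm -f "$log"
```

Ten repeated lines per restart would be consistent with small, identically sized files appearing after each reload, though that link is an inference and not something the thread confirms.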

Regards,
Mike

On Mon, Mar 4, 2013 at 4:13 PM, Hari Shreedharan
<[EMAIL PROTECTED]> wrote:

>  Flume will reload the configuration file every time it is modified. Since
> puppet rewrites it, Flume reloads it. The events are probably replayed
> because of the transactions being incomplete or something like that. File
> Channel will not replay the events if they have been completely persisted
> to HDFS and transaction closed. If puppet does not rewrite the config file,
> do you see this issue?
>
> --
> Hari Shreedharan
>
> On Monday, March 4, 2013 at 3:06 PM, Sagar Mehta wrote:
>
> I think we found the issue, not sure if this is the root cause but looks
> highly correlated.
>
> So we manage configs using puppet which currently runs in a cron mode with
> following configuration
>
> ## puppetrun Cron Job
> 20,50 * * * *  root sleep $((RANDOM\%60)) > /dev/null 2>&1; puppet agent
> --onetime --no-daemonize --logdest syslog > /dev/null 2>&1
>
>  *Note - the times at which puppet is run along with the time-stamps in
> the listing below.*
>
> Also after combing through flume logs, we noticed Flume is reloading the
> configuration after every puppet run
>
> sagar@drspock ~/temp $ cat flume.log.2013-03-03 | egrep -i "reloading" |
> head -5
> 2013-03-03 00:20:44,174 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 00:51:14,374 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 01:21:15,072 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 01:51:15,778 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
> 2013-03-03 02:20:46,481 [conf-file-poller-0] INFO
>  org.apache.flume.conf.properties.PropertiesFileConfigurationProvider -
> Reloading configuration file:/opt/flume/conf/hdfs.conf
>
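
One way to check the correlation with the puppet schedule is to pull the minute out of each reload line. The sketch below uses the five log lines quoted above, joined back onto one line each; the temporary file stands in for flume.log.2013-03-03.

```shell
# Sample lines copied from the excerpt above (one log entry per line).
log=$(mktemp)
cat > "$log" <<'EOF'
2013-03-03 00:20:44,174 [conf-file-poller-0] INFO org.apache.flume.conf.properties.PropertiesFileConfigurationProvider - Reloading configuration file:/opt/flume/conf/hdfs.conf
2013-03-03 00:51:14,374 [conf-file-poller-0] INFO org.apache.flume.conf.properties.PropertiesFileConfigurationProvider - Reloading configuration file:/opt/flume/conf/hdfs.conf
2013-03-03 01:21:15,072 [conf-file-poller-0] INFO org.apache.flume.conf.properties.PropertiesFileConfigurationProvider - Reloading configuration file:/opt/flume/conf/hdfs.conf
2013-03-03 01:51:15,778 [conf-file-poller-0] INFO org.apache.flume.conf.properties.PropertiesFileConfigurationProvider - Reloading configuration file:/opt/flume/conf/hdfs.conf
2013-03-03 02:20:46,481 [conf-file-poller-0] INFO org.apache.flume.conf.properties.PropertiesFileConfigurationProvider - Reloading configuration file:/opt/flume/conf/hdfs.conf
EOF

# Field 2 is the time (HH:MM:SS,mmm); print only the minute of each reload.
minutes=$(grep -i "reloading" "$log" \
  | awk '{ split($2, t, ":"); print t[2] }' \
  | paste -sd ' ' -)

echo "$minutes"   # 20 51 21 51 20
rm -f "$log"
```

Every reload falls at or within about a minute after minute 20 or 50, which fits a cron entry at 20,50 followed by a random sleep of up to 59 seconds.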
> The way we have our current setup, the flume config file
> namely /opt/flume/conf/hdfs.conf is re-written after every puppet run due
> to variable interpolation in the template.
>
>  *We are still not sure what is causing Flume to reload the config file,
> and even if the file is reloaded why are the same events getting replayed
> [the state should be saved somewhere on disk - that's what the file channel
> is for I thought]*
>
> Any pointers/insights appreciated.
>
> Sagar
>
>
> On Mon, Mar 4, 2013 at 2:42 PM, Sagar Mehta <[EMAIL PROTECTED]> wrote:
>
> Guys,
>
> Yes, this issue was also seen with the memory channel. In fact, when we moved
> to the file-based channel, we initially thought this issue wouldn't occur since
> it stores checkpoints.
>
> Anyways below are all files for collector110 [whose source didn't receive
> any events] and you can see all the replays below. I have attached the
> corresponding flume log file for the same day.
>
> hadoop@jobtracker301:/home/smehta$ hls
> /ngpipes-raw-logs/2013-03-03/*/collector110* |  head -5
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:20
> /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362270044367.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 00:51
> /ngpipes-raw-logs/2013-03-03/0000/collector110.ngpipes.sac.ngmoco.com.1362271875065.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:21
> /ngpipes-raw-logs/2013-03-03/0100/collector110.ngpipes.sac.ngmoco.com.1362273675770.gz
> -rw-r--r--   3 hadoop supergroup       1594 2013-03-03 01:51
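
The numeric part of each file name in the listing looks like a Unix timestamp in milliseconds. That is an assumption; the thread does not say how the names are generated. Under that assumption, a short sketch converts the three complete names above to a UTC time of day so they can be compared with the reload times in the Flume log:

```shell
# Convert epoch-millisecond suffixes (taken from the file names above)
# to HH:MM:SS in UTC, using plain arithmetic so no date(1) extensions are needed.
times=$(for ms in 1362270044367 1362271875065 1362273675770; do
  awk -v ms="$ms" 'BEGIN {
    s = int(ms / 1000) % 86400            # seconds since midnight UTC
    printf "%02d:%02d:%02d\n", int(s / 3600), int((s % 3600) / 60), s % 60
  }'
done)

echo "$times"
# 00:20:44
# 00:51:15
# 01:21:15
```

If the Flume log is also in UTC, each file was created within about a second of a "Reloading configuration file" entry.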
Sagar Mehta 2013-03-05, 17:53