Flume >> mail # user >> Dupes


Cochran, David 2013-04-05, 13:52
Hi Dave,

Could you post your agent's configuration file?

Sometimes, small misconfigurations can result in unintended or undefined
behavior.

On Fri, Apr 5, 2013 at 9:52 AM, Cochran, David <[EMAIL PROTECTED]> wrote:

> I'm seeing a LOT of random dupes in some of my log files....
>
> This is pretty consistent, in one file in particular: the log being tail'ed
> averages ~20M per day, every day, but on the only sink (FILE_ROLL) the
> resulting 24-hour log is 55M.  Some quick counts from grep'ing a random time
> (e.g. 07:23) show the sink log with a dozen or so more lines with that
> timestamp than the source has, every minute.
>
> This has been happening like clockwork every day for the last couple of
> months, since I started using Flume on this box.
>
> I did check that there wasn't another source from this or another server
> sending to the same port...and the entries of the log file look proper for
> that app.
>
> The logs are not rolling at the same time on the source/sink, and I've not
> yet taken the time to set up copies of each beginning and ending at the same
> times and run a diff against them, but a preliminary 'eyeball diff' just
> shows dupes.  I will note that on the source a line with the exact same text
> may appear more than once, as the logging mechanism does not log more
> precisely than hour/minute.
>
> All in all, dupes are better than drops, but is there anything in
> particular I should look for to try to find the cause of and eliminate this?
>
>
> Thanks in advance for any thoughts,
> Dave
>
>
>
>
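The per-minute comparison described in the quoted message (grep'ing a time
such as 07:23 in both the source log and the sink log) could be automated
roughly as follows. This is only a sketch, not anything Flume provides: it
assumes every log line contains an HH:MM timestamp, and the function names
and the regex are made up for illustration. It reports which minutes have
more lines on the sink side, and which exact lines occur more often in the
sink than in the source.

```python
import re
from collections import Counter

# Assumption: each log line carries an HH:MM timestamp somewhere in its text.
MINUTE_RE = re.compile(r"\b(?:[01]\d|2[0-3]):[0-5]\d\b")


def count_per_minute(lines):
    """Count how many lines fall into each HH:MM bucket."""
    counts = Counter()
    for line in lines:
        match = MINUTE_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


def surplus_per_minute(source_lines, sink_lines):
    """Return {minute: extra_line_count} for minutes where the sink has more lines."""
    src = count_per_minute(source_lines)
    snk = count_per_minute(sink_lines)
    return {
        minute: snk[minute] - src[minute]
        for minute in sorted(snk)
        if snk[minute] > src[minute]
    }


def surplus_lines(source_lines, sink_lines):
    """Return a Counter of exact lines that occur more often in the sink.

    Counter subtraction keeps only positive differences, so a line that is
    legitimately repeated N times in the source is flagged only when the
    sink holds more than N copies of it.
    """
    return Counter(sink_lines) - Counter(source_lines)
```

To use it, read both files covering the same time window (for example with
`open(path).read().splitlines()`) and pass the two lists in. Because the
source and sink logs roll at different times, the two inputs need to be
trimmed to matching start and end times first, otherwise the edges of the
window will show up as false surpluses.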
Cochran, David 2013-04-05, 14:22