Flume, mail # user - Design problem while monitoring Flume


Re: Design problem while monitoring Flume
Anat Rozenzon 2013-08-28, 12:59
Thank you for the quick answer.

How can I process events after they have been written? Is there a
post-write interceptor I can write?
On Wed, Aug 28, 2013 at 11:45 AM, Juhani Connolly <
[EMAIL PROTECTED]> wrote:

> The most common cause of resending events from the source would be failure
> to write to the channel. Most of the time this would be because the channel
> is full.
>
> The best approach to collecting statistics depends on what exactly you want
> to do, but perhaps you could write metadata to headers in the interceptor and
> then batch process the serialized headers after events have actually been
> written. Or, if you need to be realtime, you can replicate events to an
> additional path which leads to a custom sink that collects statistics. So
> long as the sink doesn't "bounce" events (roll back transactions), it
> shouldn't get any events resent.
>
> One thing to keep in mind though is that Flume in general only guarantees
> delivery; it doesn't guarantee that events will be delivered only
> once (though many components do deliver only once).
>
>
> On 08/28/2013 04:09 PM, Anat Rozenzon wrote:
>
>> Hi,
>>
>> I want to get some statistics out of Flume (for example, how many records
>> were collected, how many files, etc.).
>> I've written my own interceptor that updates an MBean whenever records
>> arrive.
>>
>> I've also written a MonitorServices that collects the data from the MBean
>> every X minutes and sends it to a database.
>>
>> My problem is that sometimes events are resent from the source; I
>> saw that while debugging.
>> Not sure why... maybe because of a timeout while sending to the sink?
>>
>> Anyway, if this happens in production it will corrupt my statistics.
>>
>> Is there any way I can know that an event has failed to reach the sink
>> even though it passed the interceptor?
>> Is there a better place to collect such statistics than an interceptor?
>>
>> Thanks
>> Anat
>>
>
>
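
A minimal sketch of the bookkeeping behind the statistics sink suggested in the quoted reply, assuming each event carries a unique ID (for example in a header set by the interceptor). The class name CommittedStats and its methods are hypothetical and not part of Flume. The idea is to stage counts while a transaction is open, add them to the total only after the commit succeeds, and discard them on rollback, so resent events are not counted twice. Because Flume guarantees delivery but not exactly-once delivery, the sketch also skips IDs it has already counted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Flume class: counts events only after the
// surrounding transaction has committed, and ignores redelivered IDs.
final class CommittedStats {
    // IDs that have already been counted. This grows without bound in this
    // sketch; a real deployment would need to expire old entries.
    private final Set<String> seenIds = new HashSet<>();
    // IDs taken inside the transaction that is currently open.
    private final List<String> staged = new ArrayList<>();
    private long totalEvents;

    // Call once for each event taken inside the current transaction.
    synchronized void stage(String eventId) {
        staged.add(eventId);
    }

    // Call after the transaction has committed: publish the staged events,
    // skipping any ID that was already counted in an earlier commit.
    synchronized void commit() {
        for (String id : staged) {
            if (seenIds.add(id)) {
                totalEvents++;
            }
        }
        staged.clear();
    }

    // Call after a rollback: these events will be delivered again later,
    // so they must not be counted now.
    synchronized void rollback() {
        staged.clear();
    }

    // Number of distinct events seen in committed transactions.
    synchronized long totalEvents() {
        return totalEvents;
    }
}
```

In a custom sink, stage() would presumably be called for each event taken from the channel, commit() right after the channel transaction commits, and rollback() in the failure path. The MBean that the monitoring service polls could then expose totalEvents() instead of a counter updated in the interceptor.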