Re: Design problem while monitoring Flume
Anat Rozenzon 2013-08-28, 12:59
Thank you for the quick answer.
How can I process events after they have been written? Is there any
post-write interceptor I can code?
On Wed, Aug 28, 2013 at 11:45 AM, Juhani Connolly <
[EMAIL PROTECTED]> wrote:
> The most common cause of resending events from the source would be failure
> to write to the channel. Most of the time this would be because the channel
> is full.
> An approach to collecting statistics will vary depending on what exactly you
> want to do, but perhaps you could write metadata to headers in the
> interceptor and then batch-process the serialized headers after events have
> actually been written. Or, if you need real-time statistics, you can
> replicate events to an additional path that leads to a custom sink which
> collects them. As long as the sink doesn't "bounce" events (roll back
> transactions), it shouldn't get any events resent.
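The "additional path" approach described above could be wired up with Flume's replicating channel selector, which copies every event into each channel attached to the source. A minimal configuration sketch; the agent and component names, and the `com.example.StatsSink` class, are hypothetical placeholders, not anything from the original thread:

```
agent.sources = src
agent.channels = mainCh statsCh
agent.sinks = mainSink statsSink

# Replicating selector: every event goes into both channels
agent.sources.src.channels = mainCh statsCh
agent.sources.src.selector.type = replicating

# Normal delivery path
agent.sinks.mainSink.channel = mainCh

# Statistics path: a custom sink that only counts events and
# always commits its transaction (never "bounces" events)
agent.sinks.statsSink.channel = statsCh
agent.sinks.statsSink.type = com.example.StatsSink
```

Because the stats sink commits every transaction, the channel never redelivers to it, so counts on this path are not inflated by sink-side retries.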
> One thing to keep in mind, though, is that Flume in general only guarantees
> delivery; it doesn't guarantee that events will be delivered only once
> (though many components do deliver only once).
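Given the at-least-once guarantee mentioned above, any statistics collector has to tolerate redelivered events. One common pattern (a sketch, not anything Flume ships; it assumes each event carries a unique ID header, e.g. a UUID stamped by an interceptor at the source) is to count only IDs that haven't been seen before:

```java
import java.util.HashSet;
import java.util.Set;

// Duplicate-tolerant counter keyed on a per-event unique ID.
// All names here are illustrative, not part of Flume's API.
class DedupCounter {
    private final Set<String> seen = new HashSet<>();
    private long uniqueEvents = 0;

    // Returns true if the event was counted,
    // false if it was a redelivery of an already-seen ID.
    boolean record(String eventId) {
        if (seen.add(eventId)) {
            uniqueEvents++;
            return true;
        }
        return false;
    }

    long uniqueCount() {
        return uniqueEvents;
    }
}
```

In production the `seen` set would need bounding (for example, a time-windowed or size-limited cache), since redeliveries normally happen close to the original attempt.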
> On 08/28/2013 04:09 PM, Anat Rozenzon wrote:
>> I want to get some statistics out of Flume (for example, how many records
>> were collected, how many files, etc.).
>> I've written my own interceptor that updates an MBean whenever records
>> pass through it.
>> I've also written a MonitorService that collects the data from the MBean
>> every X minutes and sends it to a database.
>> My problem is that sometimes events are resent from the source; I saw
>> this while debugging.
>> I'm not sure why... maybe because of a timeout while sending to the sink?
>> In any case, if this happens in production it will corrupt my statistics.
>> Is there any way I can know that an event has failed to reach the sink
>> even though it passed the interceptor?
>> Is there a better place to collect such statistics than an interceptor?
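The interceptor-plus-MBean setup described in the original message could be backed by a plain JMX standard MBean, which a MonitorService (or jconsole) can then poll. A minimal sketch using only the JDK's `javax.management`; the `FlumeStats` names and the object name are hypothetical, and the JMX standard-MBean convention requires the interface to be named `<ClassName>MBean`:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Management interface exposed over JMX (read-only counter attribute).
interface FlumeStatsMBean {
    long getEventCount();
}

class FlumeStats implements FlumeStatsMBean {
    private final AtomicLong eventCount = new AtomicLong();

    // Called from the interceptor for every record that passes through.
    void recordEvent() {
        eventCount.incrementAndGet();
    }

    @Override
    public long getEventCount() {
        return eventCount.get();
    }

    // Register with the platform MBean server so an external
    // monitor can read the counter, e.g. every X minutes.
    void register(String objectName) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        mbs.registerMBean(this, new ObjectName(objectName));
    }
}
```

Note that, per the at-least-once caveat earlier in the thread, a counter incremented in an interceptor will include redelivered events; it counts records seen by the interceptor, not records successfully written by the sink.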