Flume, mail # user - Reliable delivery of the events

Re: Reliable delivery of the events
Hari Shreedharan 2012-12-13, 06:20
Yes. If you use Flume's RpcClient interface, whether you lose data depends on how your code (the code that calls RpcClient) handles exceptions. If the RpcClient cannot deliver an event, it throws an EventDeliveryException. You can catch this exception and retry, but you will still need a strategy for repeated failures, for example when the agent stays unreachable for longer than you are willing to keep retrying.
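To make the catch-and-retry idea concrete, here is a minimal sketch of a bounded retry with an in-memory fallback buffer. It does not use the Flume classes directly: Sender and DeliveryException are hypothetical stand-ins for the append call on RpcClient and for EventDeliveryException, and the retry count, backoff, and buffer size are assumed parameters, not Flume settings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RetryingSender {
    /** Hypothetical stand-in for Flume's EventDeliveryException. */
    public static class DeliveryException extends Exception {
        public DeliveryException(String message) { super(message); }
    }

    /** Hypothetical stand-in for the append call on Flume's RpcClient. */
    public interface Sender {
        void append(byte[] body) throws DeliveryException;
    }

    private final Sender sender;
    private final int maxAttempts;
    private final long backoffMillis;
    private final int bufferCapacity;
    private final Deque<byte[]> pending = new ArrayDeque<>();

    public RetryingSender(Sender sender, int maxAttempts,
                          long backoffMillis, int bufferCapacity) {
        this.sender = sender;
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
        this.bufferCapacity = bufferCapacity;
    }

    /**
     * Tries to deliver the event up to maxAttempts times.
     * Returns true if delivered, false if it was buffered for a later flush.
     * Throws IllegalStateException when the buffer is full, so the caller
     * finds out that the event would otherwise be dropped.
     */
    public boolean send(byte[] body) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.append(body);
                return true;
            } catch (DeliveryException e) {
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    // Linear backoff between attempts.
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        if (pending.size() >= bufferCapacity) {
            throw new IllegalStateException("retry buffer full; event not accepted");
        }
        pending.addLast(body);
        return false;
    }

    /** Re-sends buffered events in order, stopping at the first failure. */
    public int flushPending() {
        int delivered = 0;
        while (!pending.isEmpty()) {
            try {
                sender.append(pending.peekFirst());
                pending.removeFirst();
                delivered++;
            } catch (DeliveryException e) {
                break;
            }
        }
        return delivered;
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

Note that the buffer here lives only in memory, so events held in it are still lost if your process crashes, and once it fills up the caller has to decide what to do with new events. That is the "multiple failures" problem in miniature.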

We recently added support for embedding a Flume agent within a host process. With this, you can embed a Flume agent in your own process, so that the embedded agent can absorb downstream failures for a longer time. Your code will receive exceptions only when the embedded agent's channel is full or experiences some other failure. This JIRA issue has more information: https://issues.apache.org/jira/browse/FLUME-1502. Unfortunately, this is not yet available in a release, but you could build Flume trunk locally if you need to deploy it urgently.

Hari Shreedharan
On Wednesday, December 12, 2012 at 9:42 PM, Guy Peleg wrote:

> Hi,
> From the documentation: "Flume uses a transactional approach to guarantee the reliable delivery of the events"
> But if I have a multi-hop flow, and the first agent crashes before the message received from the client was stored on the channel, then the message is lost, since the client is not synchronized with the Flume-NG transaction mechanism. Is that correct?
> Thanks,
> Guy