Flume, mail # user - Guarantees of the memory channel for delivering to sink


Rahul Ravindran 2012-11-06, 21:32
Hi,
   I am very new to Flume and we are hoping to use it for our log aggregation into HDFS. I have a few questions below:

FileChannel will double our disk IO, which will hurt IO performance on certain performance-sensitive machines. Hence, I was hoping to write a custom Flume source which will use a memory channel and perform its own checkpointing. The checkpoint will be updated after each successful insertion into the memory channel. (I realize that this carries a risk of data loss, bounded by the capacity of the memory channel.)

   As long as there is capacity in the memory channel buffers, does the memory channel guarantee delivery to a sink (does it wait for acknowledgements, and retry failed packets)? This would mean that we need to ensure that we do not exceed the channel capacity.

I am writing a custom source which will use the memory channel, and which will catch a ChannelException to detect channel capacity problems (that is, the memory channel's buffer is full because of lagging sinks, network issues, etc.). Is that a reasonable assumption to make?
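
To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It does not use the Flume API at all: BoundedChannel and ChannelFullException are hypothetical stand-ins for the memory channel and ChannelException, and CheckpointingSource is a made-up name for my custom source. The only point is that the checkpoint advances after a put succeeds, and that a full channel makes the source back off and retry from the checkpoint later.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class CheckpointingSourceSketch {

    /** Hypothetical stand-in for Flume's ChannelException. */
    static class ChannelFullException extends RuntimeException {
        ChannelFullException(String message) { super(message); }
    }

    /** Hypothetical bounded in-memory channel; NOT Flume's MemoryChannel. */
    static class BoundedChannel {
        private final Queue<String> events = new ArrayDeque<>();
        private final int capacity;

        BoundedChannel(int capacity) { this.capacity = capacity; }

        /** Rejects the event when the buffer is already at capacity. */
        void put(String event) {
            if (events.size() >= capacity) {
                throw new ChannelFullException("channel is full");
            }
            events.add(event);
        }

        /** Simulates a sink draining one event; returns null when empty. */
        String take() { return events.poll(); }

        int size() { return events.size(); }
    }

    /** Source that moves its checkpoint only after a successful put. */
    static class CheckpointingSource {
        private final String[] log;
        private final BoundedChannel channel;
        private int checkpoint = 0; // index of the next line to deliver

        CheckpointingSource(String[] log, BoundedChannel channel) {
            this.log = log;
            this.channel = channel;
        }

        /** Pushes lines until the log ends or the channel fills up. */
        int pump() {
            int delivered = 0;
            while (checkpoint < log.length) {
                try {
                    channel.put(log[checkpoint]);
                } catch (ChannelFullException e) {
                    break; // back off; the next pump() resumes at checkpoint
                }
                checkpoint++; // advance only after the put succeeded
                delivered++;
            }
            return delivered;
        }

        int checkpoint() { return checkpoint; }
    }

    public static void main(String[] args) {
        String[] log = {"a", "b", "c", "d", "e"};
        BoundedChannel channel = new BoundedChannel(3);
        CheckpointingSource source = new CheckpointingSource(log, channel);

        System.out.println(source.pump());       // 3: channel is now full
        channel.take();
        channel.take();                          // sink drains two events
        System.out.println(source.pump());       // 2: remaining lines fit
        System.out.println(source.checkpoint()); // 5: whole log handed over
    }
}
```

Note that the checkpoint here only records what has entered the channel, not what a sink has delivered, so anything still buffered in memory when the process dies is lost. That is the data-loss window mentioned above.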

Thanks,
~Rahul.