Flume, mail # user - file channel read performance impacted by write rate


file channel read performance impacted by write rate
Jan Van Besien 2013-11-13, 09:32
Hi,

I noticed that the rate at which sinks can read from a file channel is
heavily dependent on the rate at which sources are writing into that
file channel.

It can be easily tested with the null sink and a source that writes
events one by one into the file channel.

If the source writes events one by one (at the maximum speed the file
channel can handle), the rate at the sink is easily more than 10 times
slower than if the source is not writing at all, or batching the writes.
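
For reference, a minimal sketch of the kind of test setup described above.
The agent name (a1), the component names and the directories are
placeholders, and I am assuming the sequence generator source with its
batchSize property as the "one by one" writer; any source that commits a
single event per transaction should behave the same way.

```
# Hypothetical agent "a1": seq source -> file channel -> null sink
a1.sources = src
a1.channels = fc
a1.sinks = snk

# Source writing one event per transaction (assumed: batchSize = 1)
a1.sources.src.type = seq
a1.sources.src.batchSize = 1
a1.sources.src.channels = fc

# File channel; both paths are placeholders
a1.channels.fc.type = file
a1.channels.fc.checkpointDir = /tmp/flume/checkpoint
a1.channels.fc.dataDirs = /tmp/flume/data

# Null sink draining the channel in batches
a1.sinks.snk.type = null
a1.sinks.snk.batchSize = 100
a1.sinks.snk.channel = fc
```

Raising the source batch size, or stopping the source once the channel
has filled up, gives the comparison case.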

I can understand that there is an impact, but the impact seems really
big. I have a case here where the write rate into the file channel
(events are written one by one) is actually good enough, but the read
rate suffers so much that it becomes the bottleneck. I can solve it by
putting a memory channel in front, so that the writes into the file
channel are done in batches, but that means I lose overall durability
of the events.
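
One way such a workaround could be wired up within a single agent is
sketched below. This is an assumption about the topology, not
necessarily the exact setup in use: a memory channel feeds an Avro sink
that forwards batches to a local Avro source, which then writes each
batch into the file channel in one transaction. All names, the port and
the paths are placeholders.

```
# Hypothetical agent "a1": src -> memory channel -> avro hop -> file channel -> sink
a1.sources = src hop_in
a1.channels = mc fc
a1.sinks = hop_out snk

# Original source, still writing one event at a time (assumed seq source)
a1.sources.src.type = seq
a1.sources.src.channels = mc

# Memory channel in front: fast, but events in it are lost on a crash
a1.channels.mc.type = memory
a1.channels.mc.capacity = 10000
a1.channels.mc.transactionCapacity = 1000

# Avro sink takes events from the memory channel in batches
a1.sinks.hop_out.type = avro
a1.sinks.hop_out.hostname = localhost
a1.sinks.hop_out.port = 41414
a1.sinks.hop_out.batch-size = 1000
a1.sinks.hop_out.channel = mc

# Avro source receives each batch and puts it into the file channel
a1.sources.hop_in.type = avro
a1.sources.hop_in.bind = localhost
a1.sources.hop_in.port = 41414
a1.sources.hop_in.channels = fc

# File channel now sees batched writes; paths are placeholders
a1.channels.fc.type = file
a1.channels.fc.checkpointDir = /tmp/flume/checkpoint
a1.channels.fc.dataDirs = /tmp/flume/data

a1.sinks.snk.type = null
a1.sinks.snk.channel = fc
```

The durability gap is the memory channel: anything that has not yet been
forwarded to the file channel is gone if the agent dies.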

Any insights on this?

Thanks,
Jan