Re: file channel read performance impacted by write rate
On 11/13/2013 03:04 PM, Brock Noland wrote:
> The file channel uses a WAL which sits on disk.  Each time an event is
> committed an fsync is called to ensure that data is durable. Without
> this fsync there is no durability guarantee. More details here:
> https://blogs.apache.org/flume/entry/apache_flume_filechannel

Yes indeed. I was just not expecting the performance impact to be that big.

> The issue is that when the source is committing one-by-one it's
> consuming the disk doing an fsync for each event.  I would find a way to
> batch up the requests so they are not written one-by-one or use multiple
> disks for the file channel.

I am already using multiple disks (4) for the channel. Batching the
requests is indeed what I am doing to prevent the file channel from being
the bottleneck (using a flume agent with a memory channel in front of the
agent with the file channel), but it inherently means that I lose
end-to-end durability because events are buffered in memory before being
flushed to disk.
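
To make the trade-off concrete, here is a minimal sketch of why the commit
pattern matters. It is not Flume's file channel code; the class name
FsyncBatchingSketch, the method writeEvents and the 128-byte payload are all
made up for illustration. It appends records to a log file and calls
FileChannel.force() either after every record or once per batch, and reports
how many syncs were issued and how long the run took.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the Flume file channel implementation.
public class FsyncBatchingSketch {

    // Appends `events` fixed-size records to `file` and forces them to disk
    // after every `batchSize` records (and once more for a trailing partial
    // batch). Returns the number of force() calls that were issued.
    static int writeEvents(Path file, int events, int batchSize) throws IOException {
        int syncs = 0;
        byte[] payload = new byte[128]; // assumed event size
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 1; i <= events; i++) {
                ByteBuffer buf = ByteBuffer.wrap(payload);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                // A "commit": records written so far only count as durable
                // once force() has returned.
                if (i % batchSize == 0 || i == events) {
                    ch.force(false);
                    syncs++;
                }
            }
        }
        return syncs;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fsync-sketch");
        for (int batch : new int[] {1, 100}) {
            Path f = dir.resolve("wal-" + batch + ".log");
            long start = System.nanoTime();
            int syncs = writeEvents(f, 1000, batch);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("batchSize=" + batch
                    + " fsyncs=" + syncs + " elapsedMs=" + ms);
        }
    }
}
```

With a batch size of 1 every event pays for its own sync; with a batch size
of 100 the same 1000 events need only 10 syncs, but any events written since
the last sync can be lost in a crash, which is the durability gap I describe
above. The elapsed times depend entirely on the disk and file system.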

thanks,
Jan
