Re: Configuring flume for better throughput
A lot of it depends on the disks you are using and how many disks you have given the file channel. In general, performance improves if you give it more disks, as it round-robins between disks, so multiple writes and reads can happen without waiting for a full seek.
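
For concreteness, spreading a single file channel over several disks is just a comma-separated dataDirs list, one directory per physical disk. A minimal sketch, assuming an agent named agent1, a channel named ch1, and mount points /disk1 through /disk4 (none of these paths are from the original config):

agent1.channels.ch1.type = FILE
agent1.channels.ch1.checkpointDir = /disk1/flume/checkpoint
# one data directory per dedicated disk; the channel round-robins writes across them
agent1.channels.ch1.dataDirs = /disk2/flume/data,/disk3/flume/data,/disk4/flume/data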

Also, the file channel writes every event to disk when it is put into the channel, and events are read back from disk when they are taken (see the Log-Structured File System paper for details on the basic design). This allows the channel to hold more events than can fit in memory and also allows full recovery from failure. I'd recommend using a Null sink, or a custom sink that updates some metrics and does nothing else, to see whether the File Channel is really your bottleneck.
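
A minimal sketch of that test, assuming the channel name ch1 from the config quoted below (the sink name and batch size here are made up):

agent1.sinks.null-test.type = null
agent1.sinks.null-test.channel = ch1
# discard events as fast as they can be taken, isolating FileChannel read throughput
agent1.sinks.null-test.batchSize = 1000
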
Thanks,
Hari
On Wednesday, July 31, 2013 at 7:24 PM, Pankaj Gupta wrote:

> Also, agent1.sinks.hdfs-sink1-1.hdfs.threadsPoolSize = 1 might seem odd, but we only write to one file on HDFS per sink, so 1 seems to be the right value. In any case, I've tried increasing this value to 10 to no effect.
>
>
> On Wed, Jul 31, 2013 at 7:22 PM, Pankaj Gupta <[EMAIL PROTECTED]> wrote:
> > I'm continuing to debug the performance issues; I've added more sinks, but it all seems to boil down to the performance of the FileChannel. Right now I'm focusing on the performance of the HDFS writer machine. On that machine I have 4 disks (apart from a separate disk just for the OS), so I'm using 4 file channels with checkpoint + data directories, each on its own dedicated disk. As mentioned earlier, Avro Sinks write to these FileChannels and HDFS Sinks drain the channels. I'm getting very poor performance draining the channels, ~2.5 MB/s for all 4 channels combined. I replaced the file channel with a memory channel just to test and saw that I could drain the channels at more than 15 MB/s. So the HDFS sinks aren't the issue.
> >
> > I haven't seen any issue with writing to the FileChannel so far; I'm surprised that reading is turning out to be slower. Here are the FileChannel stats:
> > "CHANNEL.ch1": {
> >         "ChannelCapacity": "75000000",
> >         "ChannelFillPercentage": "7.5033080000000005",
> >         "ChannelSize": "5627481",
> >         "EventPutAttemptCount": "11465743",
> >         "EventPutSuccessCount": "11465481",
> >         "EventTakeAttemptCount": "5841907",
> >         "EventTakeSuccessCount": "5838000",
> >         "StartTime": "1375320933471",
> >         "StopTime": "0",
> >         "Type": "CHANNEL"
> >     },
> >    
> >
> > EventTakeAttemptCount is much lower than EventPutAttemptCount and the sinks are lagging. I'm surprised that even the attempts to drain the channel are fewer. That would seem to point to the HDFS sinks, but they do just fine with the Memory Channel, so they are clearly not bound on either writing to HDFS or on network I/O. I've checked the network capacity separately as well and we are using less than 10% of it, so we are definitely not bound there. (A note after this message sketches how counters like these can be pulled.)
> >
> > In my workflow the reliability of the FileChannel is essential, so I can't switch to the Memory Channel. I would really appreciate any suggestions on how to tune the performance of the FileChannel. Here are the settings of one of the FileChannels:
> >
> > agent1.channels.ch1.type = FILE
> > agent1.channels.ch1.checkpointDir = /flume1/checkpoint
> > agent1.channels.ch1.dataDirs = /flume1/data
> > agent1.channels.ch1.maxFileSize = 375809638400
> > agent1.channels.ch1.capacity = 75000000
> >
> > agent1.channels.ch1.transactionCapacity = 24000
> > agent1.channels.ch1.checkpointInterval = 300000
> >
> >
> > As can be seen, I increased the checkpointInterval, but that didn't help either.
> >
> > Here are the settings for one of the HDFS Sinks. I have tried varying the number of these sinks from 8 to 32 to no effect:
> > agent1.sinks.hdfs-sink1-1.channel = ch1
> > agent1.sinks.hdfs-sink1-1.type = hdfs
> > #Use DNS of the HDFS namenode
> > agent1.sinks.hdfs-sink1-1.hdfs.path = hdfs://nameservice1/store/f-1-1/
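
Counters in the format quoted above ("CHANNEL.ch1": {...}) are what Flume's built-in JSON monitoring reports over HTTP. A minimal sketch of enabling it and polling the metrics endpoint, assuming an arbitrary port and a config file named agent1.conf:

flume-ng agent --conf conf --conf-file agent1.conf --name agent1 \
  -Dflume.monitoring.type=http -Dflume.monitoring.port=34545
# then poll the counters, e.g. with curl:
curl http://localhost:34545/metrics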