Flume >> mail # user >> Recommendation of parameters for better performance with File Channel


Re: Recommendation of parameters for better performance with File Channel
Hi,

Why not try increasing the batch size on the source and sink to 10,000?
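
As a rough sketch of that change against the configuration quoted below (untested on this setup): the source and sink names are taken from the quoted config, while the transactionCapacity line is an assumption added here, on the understanding that a channel's transaction capacity has to be at least as large as the biggest batch put into or taken from it.

```properties
# Larger batches on the source and on the HDFS sink
agent.sources.spooler.batchSize = 10000
agent.sinks.hdfsSink.hdfs.batchSize = 10000

# Assumption: raise the File Channel's per-transaction limit to match,
# otherwise batches of 10,000 events may be rejected by the channel
agent.channels.fileChannel.transactionCapacity = 10000
```

Larger batches mean fewer transactions, and therefore fewer disk syncs in the File Channel per event, which is usually where the throughput goes.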

Brock

On Wed, Dec 12, 2012 at 4:08 AM, Jagadish Bihani
<[EMAIL PROTECTED]> wrote:
>
> I am using the latest release of Flume (1.3.0) and Hadoop 1.0.3.
>
>
> On 12/12/2012 03:35 PM, Jagadish Bihani wrote:
>>
>> Hi
>>
>> I am able to write a maximum of 1.5 MB/sec of data to HDFS (without
>> compression) using File Channel. Are there any recommendations to improve
>> the performance?
>> Has anybody achieved around 10 MB/sec with File Channel? If yes, please
>> share the configuration (hardware used, RAM allocated, and batch sizes of
>> source, sink and channels).
>>
>> Following are the configuration details :
>> =======================
>>
>> I am using a machine with reasonable hardware configuration:
>> Quadcore 2.00 GHz processors and 4 GB RAM.
>>
>> Command line options passed to flume agent :
>> -DJAVA_OPTS="-Xms1g -Xmx4g -Dcom.sun.management.jmxremote
>> -XX:MaxDirectMemorySize=2g"
>>
>> Agent Configuration:
>> ============
>> agent.sources = avro-collection-source spooler
>> agent.channels = fileChannel
>> agent.sinks = hdfsSink fileSink
>>
>> # For each one of the sources, the type is defined
>>
>> agent.sources.spooler.type = spooldir
>> agent.sources.spooler.spoolDir =/root/test_data
>> agent.sources.spooler.batchSize = 1000
>> agent.sources.spooler.channels = fileChannel
>>
>> # Each sink's type must be defined
>> agent.sinks.hdfsSink.type = hdfs
>> agent.sinks.hdfsSink.hdfs.path=hdfs://mltest2001/flume/release3Test
>>
>> agent.sinks.hdfsSink.hdfs.fileType =DataStream
>> agent.sinks.hdfsSink.hdfs.rollSize=0
>> agent.sinks.hdfsSink.hdfs.rollCount=0
>> agent.sinks.hdfsSink.hdfs.batchSize=1000
>> agent.sinks.hdfsSink.hdfs.rollInterval=60
>>
>> agent.sinks.hdfsSink.channel= fileChannel
>>
>> agent.channels.fileChannel.type=file
>> agent.channels.fileChannel.dataDirs=/root/flume_channel/dataDir13
>>
>> agent.channels.fileChannel.checkpointDir=/root/flume_channel/checkpointDir13
>>
>> Regards,
>> Jagadish
>
>

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/