Flume, mail # user - Recommendation of parameters for better performance with File Channel


+
Jagadish Bihani 2012-12-12, 10:05
+
Jagadish Bihani 2012-12-12, 10:08
Re: Recommendation of parameters for better performance with File Channel
Brock Noland 2012-12-12, 15:36
Hi,

Why not try increasing the batch size on the source and sink to 10,000?
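
As a rough sketch of that change, using the source, sink, and channel names
from the configuration quoted below. The transactionCapacity line is an
assumption on my part: as far as I know the File Channel rejects batches
larger than its transaction capacity, so check the default for your Flume
version in the user guide before relying on it.

```
# Sketch only: raise both batch sizes from 1000 to 10000
agent.sources.spooler.batchSize = 10000
agent.sinks.hdfsSink.hdfs.batchSize = 10000

# Assumption: the channel must accept transactions at least this large
agent.channels.fileChannel.transactionCapacity = 10000
```

Larger batches mean fewer File Channel commits for the same number of events,
which is usually where the time goes with a disk-backed channel.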

Brock

On Wed, Dec 12, 2012 at 4:08 AM, Jagadish Bihani
<[EMAIL PROTECTED]> wrote:
>
> I am using the latest release of Flume (1.3.0) and Hadoop 1.0.3.
>
>
> On 12/12/2012 03:35 PM, Jagadish Bihani wrote:
>>
>> Hi
>>
>> I am able to write at most 1.5 MB/sec of data to HDFS (without compression)
>> using File Channel. Are there any recommendations to improve the
>> performance?
>> Has anybody achieved around 10 MB/sec with File Channel? If so, please
>> share the configuration (hardware used, RAM allocated, and batch sizes of
>> the source, sink, and channels).
>>
>> Following are the configuration details :
>> =======================
>>
>> I am using a machine with reasonable hardware configuration:
>> Quadcore 2.00 GHz processors and 4 GB RAM.
>>
>> Command line options passed to flume agent :
>> -DJAVA_OPTS="-Xms1g -Xmx4g -Dcom.sun.management.jmxremote
>> -XX:MaxDirectMemorySize=2g"
>>
>> Agent Configuration:
>> ============
>> agent.sources = avro-collection-source spooler
>> agent.channels = fileChannel
>> agent.sinks = hdfsSink fileSink
>>
>> # For each one of the sources, the type is defined
>>
>> agent.sources.spooler.type = spooldir
>> agent.sources.spooler.spoolDir =/root/test_data
>> agent.sources.spooler.batchSize = 1000
>> agent.sources.spooler.channels = fileChannel
>>
>> # Each sink's type must be defined
>> agent.sinks.hdfsSink.type = hdfs
>> agent.sinks.hdfsSink.hdfs.path=hdfs://mltest2001/flume/release3Test
>>
>> agent.sinks.hdfsSink.hdfs.fileType =DataStream
>> agent.sinks.hdfsSink.hdfs.rollSize=0
>> agent.sinks.hdfsSink.hdfs.rollCount=0
>> agent.sinks.hdfsSink.hdfs.batchSize=1000
>> agent.sinks.hdfsSink.hdfs.rollInterval=60
>>
>> agent.sinks.hdfsSink.channel= fileChannel
>>
>> agent.channels.fileChannel.type=file
>> agent.channels.fileChannel.dataDirs=/root/flume_channel/dataDir13
>>
>> agent.channels.fileChannel.checkpointDir=/root/flume_channel/checkpointDir13
>>
>> Regards,
>> Jagadish
>
>

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
+
Hari Shreedharan 2012-12-12, 17:53
+
Bhaskar V. Karambelkar 2012-12-12, 21:13
+
Hari Shreedharan 2012-12-12, 21:44
+
Jagadish Bihani 2012-12-18, 11:05
+
Juhani Connolly 2012-12-19, 09:23