Flume >> mail # user >> Recommendation of parameters for better performance with File Channel


Jagadish Bihani 2012-12-12, 10:05
Jagadish Bihani 2012-12-12, 10:08
Brock Noland 2012-12-12, 15:36
Hari Shreedharan 2012-12-12, 17:53
Bhaskar V. Karambelkar 2012-12-12, 21:13
Re: Recommendation of parameters for better performance with File Channel
Yep, each sink with a different prefix will work fine too. My suggestion was just meant to avoid collision - file prefixes are good enough for that.
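
A minimal sketch of that setup, assuming two hypothetical sinks named hdfsSink1 and hdfsSink2 that drain the same file channel and write to one HDFS directory. The sink names and the prefixes sink1/sink2 are illustrative only; hdfs.filePrefix is the HDFS sink setting that controls the file name prefix, and the path is reused from the original post:

```properties
# Sketch only: sink names and prefixes below are hypothetical.
agent.sinks = hdfsSink1 hdfsSink2

# First sink: one writer thread, files named sink1.*
agent.sinks.hdfsSink1.type = hdfs
agent.sinks.hdfsSink1.channel = fileChannel
agent.sinks.hdfsSink1.hdfs.path = hdfs://mltest2001/flume/release3Test
agent.sinks.hdfsSink1.hdfs.filePrefix = sink1
agent.sinks.hdfsSink1.hdfs.fileType = DataStream

# Second sink: same channel and directory, different prefix to avoid name collisions
agent.sinks.hdfsSink2.type = hdfs
agent.sinks.hdfsSink2.channel = fileChannel
agent.sinks.hdfsSink2.hdfs.path = hdfs://mltest2001/flume/release3Test
agent.sinks.hdfsSink2.hdfs.filePrefix = sink2
agent.sinks.hdfsSink2.hdfs.fileType = DataStream
```

Each sink takes its own batches from the channel, so the two write in parallel without producing the same file name.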

--
Hari Shreedharan
On Wednesday, December 12, 2012 at 1:13 PM, Bhaskar V. Karambelkar wrote:

> Hari,
> If each sink uses a different file prefix, what's the need to write to
> multiple HDFS directories?
> All our sinks write to the same HDFS directory and each uses a unique
> file prefix, and it seems to work fine.
> Also haven't found anything in flume code or HDFS APIs which suggest
> that two sinks can't write to the same directory.
>
> Just curious.
> thanks
>
>
> On Wed, Dec 12, 2012 at 12:53 PM, Hari Shreedharan
> <[EMAIL PROTECTED] (mailto:[EMAIL PROTECTED])> wrote:
> > Also note that having multiple sinks often improves performance - though you
> > should have each sink write to a different directory on HDFS. Since each
> > sink really uses only one thread at a time to write, having multiple sinks
> > allows multiple threads to write to HDFS. Also if you can spare additional
> > disks on your Flume agent machine for file channel data directories, that
> > will also improve performance.
> >
> >
> >
> > Hari
> >
> > --
> > Hari Shreedharan
> >
> > On Wednesday, December 12, 2012 at 7:36 AM, Brock Noland wrote:
> >
> > Hi,
> >
> > Why not try increasing the batch size on the source and sink to 10,000?
> >
> > Brock
> >
> > On Wed, Dec 12, 2012 at 4:08 AM, Jagadish Bihani
> > <[EMAIL PROTECTED] (mailto:[EMAIL PROTECTED])> wrote:
> >
> >
> > I am using the latest release of Flume (1.3.0) and Hadoop 1.0.3.
> >
> >
> > On 12/12/2012 03:35 PM, Jagadish Bihani wrote:
> >
> >
> > Hi
> >
> > I am able to write at most 1.5 MB/sec of data to HDFS (without compression)
> > using File Channel. Are there any recommendations to improve the
> > performance?
> > Has anybody achieved around 10 MB/sec with File Channel? If so, please
> > share the configuration (hardware used, RAM allocated, and batch sizes of
> > source, sink, and channels).
> >
> > Following are the configuration details :
> > =======================
> >
> > I am using a machine with reasonable hardware configuration:
> > Quadcore 2.00 GHz processors and 4 GB RAM.
> >
> > Command line options passed to flume agent :
> > -DJAVA_OPTS="-Xms1g -Xmx4g -Dcom.sun.management.jmxremote
> > -XX:MaxDirectMemorySize=2g"
> >
> > Agent Configuration:
> > ============
> > agent.sources = avro-collection-source spooler
> > agent.channels = fileChannel
> > agent.sinks = hdfsSink fileSink
> >
> > # For each one of the sources, the type is defined
> >
> > agent.sources.spooler.type = spooldir
> > agent.sources.spooler.spoolDir =/root/test_data
> > agent.sources.spooler.batchSize = 1000
> > agent.sources.spooler.channels = fileChannel
> >
> > # Each sink's type must be defined
> > agent.sinks.hdfsSink.type = hdfs
> > agent.sinks.hdfsSink.hdfs.path=hdfs://mltest2001/flume/release3Test
> >
> > agent.sinks.hdfsSink.hdfs.fileType =DataStream
> > agent.sinks.hdfsSink.hdfs.rollSize=0
> > agent.sinks.hdfsSink.hdfs.rollCount=0
> > agent.sinks.hdfsSink.hdfs.batchSize=1000
> > agent.sinks.hdfsSink.hdfs.rollInterval=60
> >
> > agent.sinks.hdfsSink.channel= fileChannel
> >
> > agent.channels.fileChannel.type=file
> > agent.channels.fileChannel.dataDirs=/root/flume_channel/dataDir13
> >
> > agent.channels.fileChannel.checkpointDir=/root/flume_channel/checkpointDir13
> >
> > Regards,
> > Jagadish
> >
> >
> >
> >
> > --
> > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
> >
>
>
>
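
The two tuning suggestions quoted above (larger batch sizes on the source and sink, and extra disks for the file channel's data directories) could be applied to the original configuration roughly as follows. This is a sketch under stated assumptions, not a tested recommendation: the mount points /disk1 and /disk2 are hypothetical, and the transactionCapacity line assumes the channel must accept a transaction at least as large as the biggest batch, so check the default for your Flume version.

```properties
# Sketch only: measure throughput before and after each change.

# Batch size of 10,000 on both source and sink, as suggested in the thread
agent.sources.spooler.batchSize = 10000
agent.sinks.hdfsSink.hdfs.batchSize = 10000

# Assumption: the channel transaction must be able to hold one full batch
agent.channels.fileChannel.type = file
agent.channels.fileChannel.transactionCapacity = 10000

# Hypothetical mount points: one data directory per physical disk, comma separated
agent.channels.fileChannel.checkpointDir = /disk1/flume_channel/checkpointDir
agent.channels.fileChannel.dataDirs = /disk1/flume_channel/dataDir,/disk2/flume_channel/dataDir
```

Spreading dataDirs across separate disks only helps if the paths really are on different physical devices; multiple directories on one disk share the same I/O bandwidth.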
Jagadish Bihani 2012-12-18, 11:05
Juhani Connolly 2012-12-19, 09:23