Flume >> mail # user >> Why flume cannot take all bandwidth.


Why flume cannot take all bandwidth.
Hello guys,
  I am running a flume-ng performance test on an EC2 instance. I have a
two-tier setup: an avro client posts a 1G file to an avro source, and the
file is then stored to HDFS by the HDFS sink.
  I wonder why this takes about 33 seconds when network, CPU, and memory
are all under no pressure. In theory my network can handle 100MB/s, but
Flume only uses about 60MB/s.
  How can I resolve this problem?
  Thanks a lot. Below are my configuration and my test result.

# SOURCE
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = avro
a1.sources.r1.bind = 10.0.2.13
a1.sources.r1.port = 9876
a1.sources.r1.threads = 10

# SINK (HDFS)
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.filePrefix = packet
a1.sinks.k1.hdfs.batchSize = 5000
a1.sinks.k1.hdfs.fileSuffix = .snappy
a1.sinks.k1.hdfs.codeC = snappy
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollSize = 500000000
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.path = ....

# INTERCEPTORS (TIMESTAMP FOR HDFS PATH)
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i2.type = host
a1.sources.r1.interceptors.i2.preserveExisting = false
a1.sources.r1.interceptors.i2.hostHeader = test-1

# CHANNEL (MEM), capped at 1 GB of memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 5000
a1.channels.c1.byteCapacity = 1000000000
## bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

My client command:
time flume-ng avro-client -c /etc/flume-ng/conf -P /tmp/rpcProps -F /tmp/flume/tmp/test-1.tmp -Xmx1024m -Xms1024m -Xmn800m -Xss512k

Command result:
real 0m33.061s
user 0m13.689s
sys 0m5.504s
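
As a cross-check on the 60MB/s figure, the wall-clock time above can be converted into an average transfer rate. The sketch below is only arithmetic on the numbers in this post. The helper name `throughput_mb_per_s` is made up for illustration, and because the post does not say whether "1G" means 10^9 or 2^30 bytes, both are tried. The `real` time also includes the client JVM's startup, so the result is a lower bound on the rate during the actual transfer.

```python
def throughput_mb_per_s(size_bytes: int, seconds: float) -> float:
    """Average throughput in MB/s, with 1 MB taken as 10**6 bytes."""
    return size_bytes / seconds / 1e6


# "real" wall-clock time reported by the time command in the post.
REAL_SECONDS = 33.061

# Assumption: the exact size of the "1G" file is unknown, so try both readings.
CANDIDATE_SIZES = {
    "1G = 10^9 bytes": 10**9,
    "1G = 2^30 bytes": 2**30,
}

for label, size in CANDIDATE_SIZES.items():
    rate = throughput_mb_per_s(size, REAL_SECONDS)
    print(f"{label}: {rate:.1f} MB/s average over the whole run")
```

Under either reading the end-to-end average comes out at roughly 30 to 32 MB/s, which is below the 60MB/s seen on the network, so part of the 33 seconds is probably spent outside the steady-state transfer.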