

Re: Flume uses high Virtual memory
Additionally, I'd note that worrying about virtual memory on 64-bit machines
is probably not worth your time. Newer versions of malloc() do arena
allocation and reserve virtual memory for each thread. This does not,
however, actually consume physical memory; the RES column is the number to watch.
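
If you want to see the difference between reserved and resident memory for
yourself, here is a minimal sketch, assuming a Linux machine with
/proc/self/status available. It is not anything Flume or malloc() does
internally; the helper names (parse_vm_kib, read_self) are made up for
illustration. It maps a region without touching it and prints how much VmSize
(what top shows as VIRT) and VmRSS (RES) each grew.

```python
import mmap
import os


def parse_vm_kib(status_text):
    """Parse VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    out = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            out[key] = int(rest.split()[0])  # the kernel reports these in kB
    return out


def read_self():
    """Read the current process's own VmSize/VmRSS (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_vm_kib(f.read())


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    before = read_self()
    # Reserve 256 MiB of anonymous memory but never read or write any page.
    region = mmap.mmap(-1, 256 * 1024 * 1024, flags=mmap.MAP_PRIVATE)
    after = read_self()
    print("VmSize grew by", (after["VmSize"] - before["VmSize"]) // 1024, "MiB")
    print("VmRSS grew by", (after["VmRSS"] - before["VmRSS"]) // 1024, "MiB")
    region.close()
```

The same parse_vm_kib helper can be pointed at /proc/<pid>/status of the Flume
agent's java process to compare the two numbers without running top.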
On Sat, Dec 14, 2013 at 10:49 AM, Matt Wise <[EMAIL PROTECTED]> wrote:

> We ran into an issue just like this when we did not limit our source
> 'thread' counts. The Avro source seems to spawn potentially thousands of
> threads if you don't limit it:
>
> a1.sources.r1.threads = 50
>
> (you can validate this with 'htop')
>
> Matt Wise
> Sr. Systems Architect
>  Nextdoor.com
>
>
> On Fri, Dec 13, 2013 at 2:58 PM, shibi S <[EMAIL PROTECTED]> wrote:
>
>>
>> A Flume agent that writes to HDFS shows high virtual memory usage
>> (15.6g). The agent writes to 3 different directories in HDFS based on the
>> type of data received. The configuration is given below. Any idea why VM
>> usage is high? I see high VM usage only on the agents that write to HDFS.
>> Other agents are low in VM usage.
>>
>> Flume version : apache-flume-1.4.0 (I tested with 1.5 version as well).
>>
>> PID    USER    PR  NI  VIRT   RES   SHR  S  %CPU  %MEM  TIME+      COMMAND
>> 38663  deploy  20  0   15.6g  576m  15m  S  2.6   0.2   225:19.29  java
>>
>> *Configuration:*
>> a1.sources.r1.selector.type = multiplexing
>> a1.sources.r1.selector.header = header1
>> a1.sources.r1.selector.mapping.red_cancel = c1
>>
>>
>> *Source Configuration:*
>> a1.sources.r1.type = avro
>> a1.sources.r1.bind = 0.0.0.0
>> a1.sources.r1.port = 60000
>>
>> *Sink configuration:*
>> a1.sinks.k1.type=hdfs
>> a1.sinks.k1.hdfs.path=hdfs://<HDFS PATH>/%Y/%m/%d/%H
>> a1.sinks.k1.hdfs.fileType = DataStream
>> a1.sinks.k1.hdfs.filePrefix = filetype1-
>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>> #a1.sinks.k1.hdfs.txnEventMax = 40000
>> a1.sinks.k1.hdfs.rollInterval = 10
>> a1.sinks.k2.hdfs.roundUnit = minute
>> a1.sinks.k1.hdfs.rollSize = 0
>> a1.sinks.k1.hdfs.rollCount = 500
>> a1.sinks.k1.hdfs.batchSize = 500
>> a1.sinks.k1.hdfs.idleTimeout =0
>> a1.sinks.k1.hdfs.maxOpenFiles = 1000
>>
>> *Channel configuration:*
>> a1.channels.c2.type=file
>> a1.channels.c2.checkpointDir =/x/home/deploy/flume/checkpoint2
>> a1.channels.c2.dataDirs = /x/home/deploy/flume/data2
>>
>>
>>
>
--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org