Flume >> mail # dev >> Non Heap Memory Leak (HDFS sink?)


Re: Non Heap Memory Leak (HDFS sink?)
Turned out to be an issue in the bzip2 codec. Switching to snappy fixed it.

-roshan
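The fix Roshan describes, switching the sink's compression from bzip2 to snappy, amounts to a one-line change in the agent's properties file. A minimal sketch, with `agent` and `hdfsSink` as placeholder names for the actual agent and sink:

```properties
# Placeholder component names -- substitute your own.
# The codec property is hdfs.codeC (note the capital C).
agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
agent.sinks.hdfsSink.hdfs.codeC = snappy
```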
On Tue, Dec 10, 2013 at 12:04 PM, Roshan Naik <[EMAIL PROTECTED]> wrote:

> Indeed, that's what I meant.
>
>
> On Tue, Dec 10, 2013 at 12:01 PM, Hari Shreedharan <
> [EMAIL PROTECTED]> wrote:
>
>> Roshan,
>>
>> These parameters are hdfs.idleTimeout and hdfs.maxOpenFiles (you need to
>> write agent.hdfsSink.hdfs.idleTimeout, thanks to some historical
>> configuration formatting).
>>
>>
>> Thanks,
>> Hari
>>
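In the full key syntax from the Flume user guide, the doubled `hdfs.` prefix Hari mentions looks like this; a sketch with placeholder agent and sink names, and illustrative values (the defaults differ):

```properties
# Placeholder names -- pattern is <agent>.sinks.<sink>.hdfs.<param>
# Values here are illustrative, not recommendations.
agent.sinks.hdfsSink.hdfs.idleTimeout = 60     # seconds before an idle bucket writer is closed
agent.sinks.hdfsSink.hdfs.maxOpenFiles = 500   # cap on simultaneously open bucket writers
```

Lowering these lets idle bucket writers (and whatever native codec state they hold) be closed and collected sooner, which is the mitigation suggested below.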
>>
>> On Tuesday, December 10, 2013 at 11:58 AM, Hari Shreedharan wrote:
>>
>> > Flume config - these are parameters for the HDFS sink.
>> >
>> >
>> > Thanks,
>> > Hari
>> >
>> >
>> > On Tuesday, December 10, 2013 at 11:54 AM, Steve Morin wrote:
>> >
>> > > Hari, is that in the Hadoop config or in the Flume config?
>> > >
>> > > > On Dec 10, 2013, at 11:30, Hari Shreedharan <
>> [EMAIL PROTECTED] (mailto:[EMAIL PROTECTED])> wrote:
>> > > >
>> > > > The reason for this is the direct memory allocations by HDFS
>> codecs. Reduce your maxOpenFiles and idleTimeout to have the bucket writers
>> garbage collected regularly.
>> > > >
>> > > >
>> > > > Thanks,
>> > > > Hari
>> > > >
>> > > >
>> > > > > On Tuesday, December 10, 2013 at 11:19 AM, Roshan Naik wrote:
>> > > > >
>> > > > > Flume version: 1.4 (compiled with hadoop 2)
>> > > > > HDFS version:
>> > > > >
>> > > > > I have the following agent config:
>> > > > > - 1 avro source (threads = 24, deflate compression)
>> > > > > - 1 file channel
>> > > > > - 4 hdfs sinks (thread pool size 2, write to a new hdfs directory
>> every 5 min, bzip2)
>> > > > >
>> > > > > Event size ~500 bytes.
>> > > > > Physical RAM : 64gb
>> > > > >
>> > > > >
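The topology described above could look roughly like the following properties file. A sketch only: the component names (a1, r1, c1, s1) and the path/port values are invented, and only settings mentioned in the thread are shown:

```properties
# Hypothetical names and values; one of the four hdfs sinks shown.
a1.sources = r1
a1.channels = c1
a1.sinks = s1 s2 s3 s4

a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
a1.sources.r1.threads = 24
a1.sources.r1.compression-type = deflate
a1.sources.r1.channels = c1

a1.channels.c1.type = file

a1.sinks.s1.type = hdfs
a1.sinks.s1.channel = c1
a1.sinks.s1.hdfs.threadsPoolSize = 2
# roll into a new directory every 5 minutes
a1.sinks.s1.hdfs.path = hdfs://namenode/flume/%Y-%m-%d/%H%M
a1.sinks.s1.hdfs.round = true
a1.sinks.s1.hdfs.roundValue = 5
a1.sinks.s1.hdfs.roundUnit = minute
a1.sinks.s1.hdfs.fileType = CompressedStream
a1.sinks.s1.hdfs.codeC = bzip2
# s2-s4 configured the same way
```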
>> > > > > The Java max heap size is capped at 8 GB, and actual heap
>> consumption on the running instance is well below that (a few hundred MB).
>> > > > > However, I am noticing in the 'top' output that the total virtual
>> memory size and resident set size keep steadily increasing over time (well
>> beyond 8 GB). Once the total resident set size of the process comes close
>> to the size of physical RAM (flume consuming 95% of it), the operating
>> system kills the flume process, leaving no trace of this death in the
>> flume logs.
>> > > > >
>> > > > > Here is the first sample top output.
>> > > > >
>> > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> > > > > 15625 root 20 0 30.1g 10g 53m S 2.0 16.0 40:45.93 java
>> > > > >
>> > > > > Here is one after a few hours
>> > > > >
>> > > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> > > > > 15625 root 20 0 61.3g 36g 1072 S 12.0 57.4 126:10.98 java
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > When I replace the HDFS sink with a null sink, the problem goes
>> away and the process remains very stable. So the file channel does not
>> seem to be the culprit.
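A symptom like this (RSS growing far past -Xmx, pointing at off-heap allocations) can be confirmed from the shell by sampling the process's resident set size over time. A minimal sketch; `FLUME_PID` is an assumption and must point at the real Flume JVM:

```shell
# Log the resident set size (KB) of a process at intervals, so growth
# beyond the 8 GB heap cap shows up clearly.
# FLUME_PID is hypothetical -- substitute the actual Flume JVM's PID.
FLUME_PID=$$   # using this shell's own PID purely for illustration
for i in 1 2 3; do
  printf '%s rss_kb=%s\n' "$(date +%T)" "$(ps -o rss= -p "$FLUME_PID" | tr -d ' ')"
  # sleep 60   # uncomment for real monitoring
done
```

If the heap (per jstat or a profiler) stays flat while RSS climbs, the growth is native/direct memory, consistent with the codec explanation above.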
>> > > > >
>> > > > > sample config is attached
>> > > > > CONFIDENTIALITY NOTICE
>> > > > > NOTICE: This message is intended for the use of the individual or
>> entity to which it is addressed and may contain information that is
>> confidential, privileged and exempt from disclosure under applicable law.
>> If the reader of this message is not the intended recipient, you are hereby
>> notified that any printing, copying, dissemination, distribution,
>> disclosure or forwarding of this communication is strictly prohibited. If
>> you have received this communication in error, please contact the sender
>> immediately and delete it from your system. Thank You.
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>>
>>
>
