Hadoop >> mail # user >> NN Memory Jumps every 1 1/2 hours


Edward Capriolo 2012-12-22, 04:24
Adam Faris 2012-12-22, 04:59
Edward Capriolo 2012-12-22, 12:54
Re: NN Memory Jumps every 1 1/2 hours
Hey, silly question...

How long have you had 27 million files?

I mean, can you correlate the number of files to the spate of OOMs?
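
For a rough sense of scale, the figures quoted further down this thread (Xmx 17GB, about 27,000,000 files) can be turned into an average heap budget per file. This is only back-of-the-envelope arithmetic, assuming "17GB" means `-Xmx17g` (17 GiB) and pretending the whole heap is available for the namespace; the class and method names are made up for illustration.

```java
class HeapPerFile {
    // Figures quoted in the thread; 17 GB is assumed to mean 17 GiB (-Xmx17g).
    static final long HEAP_BYTES = 17L * 1024 * 1024 * 1024;
    static final long FILE_COUNT = 27_000_000L;

    /** Average heap budget per file if the entire heap held namespace data. */
    static double bytesPerFile(long heapBytes, long fileCount) {
        return (double) heapBytes / fileCount;
    }

    public static void main(String[] args) {
        System.out.printf("heap budget per file: %.0f bytes%n",
                bytesPerFile(HEAP_BYTES, FILE_COUNT));
    }
}
```

That works out to roughly 676 bytes of heap per file. Recording the file count alongside heap usage over time would show whether the two actually move together.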

Even without problems, I'd say it would be a good idea to upgrade, since the newer releases probably include a lot of code fixes.

If you're running anything pre-1.x Hadoop, going to Java 1.7 wouldn't be a good idea. Having said that, outside of MapR, have any of the distros certified themselves on Java 1.7 yet?

On Dec 22, 2012, at 6:54 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> I will give this a go. I have actually gone into JMX and manually triggered
> GC, but no memory is returned. So I assumed something was leaking.
>
> On Fri, Dec 21, 2012 at 11:59 PM, Adam Faris <[EMAIL PROTECTED]> wrote:
>
>> I know this will sound odd, but try reducing your heap size. We had an
>> issue like this where GC kept falling behind and we either ran out of heap
>> or were stuck in full GC. By reducing heap, we were forcing concurrent mark
>> sweep to occur and avoided both full GC and running out of heap space, as
>> the JVM would collect objects more frequently.
>>
>> On Dec 21, 2012, at 8:24 PM, Edward Capriolo <[EMAIL PROTECTED]>
>> wrote:
>>
>>> I have an old Hadoop 0.20.2 cluster. I have not had any issues for a while
>>> (which is why I never bothered to upgrade).
>>>
>>> Suddenly it OOMed last week. Now the OOMs happen periodically. We have a
>>> fairly large NameNode heap (Xmx 17GB) and a fairly large FS, about
>>> 27,000,000 files.
>>>
>>> So the strangest thing is that every 1 1/2 hours the NN memory usage
>>> increases until the heap is full.
>>>
>>> http://imagebin.org/240287
>>>
>>> We tried failing over the NN to another machine. We changed the Java
>>> version from 1.6_23 -> 1.7.0.
>>>
>>> I have set the NameNode logs to debug and ALL and I have done the same
>>> with the data nodes.
>>> Secondary NN is running and shipping edits and making new images.
>>>
>>> I am thinking something has corrupted the NN metadata and after enough
>>> time it becomes a time bomb, but this is just a total shot in the dark.
>>> Does anyone have any interesting troubleshooting ideas?
>>
>>
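The manual check described in the quoted replies (trigger a GC over JMX and see whether any memory comes back) can be sketched with the standard `java.lang.management` API. This is a minimal illustration that inspects the JVM it runs in, not the NameNode itself; the class and method names are hypothetical, and a real check against a NameNode would need a remote JMX connection to that process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class GcReclaimCheck {
    /** Heap bytes currently in use, as reported by the platform MemoryMXBean. */
    static long usedHeap(MemoryMXBean memory) {
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Requests a GC and returns the bytes freed (negative if usage grew meanwhile). */
    static long reclaimedByGc(MemoryMXBean memory) {
        long before = usedHeap(memory);
        memory.gc(); // effectively equivalent to System.gc()
        long after = usedHeap(memory);
        return before - after;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined.
        System.out.printf("used=%d MB, max=%d MB%n",
                heap.getUsed() >> 20, heap.getMax() >> 20);
        System.out.printf("reclaimed by GC: %d MB%n", reclaimedByGc(memory) >> 20);
    }
}
```

If a requested collection frees almost nothing while used heap keeps climbing, the objects are still reachable, which is consistent with the leak suspicion above but does not by itself identify the cause.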
Joep Rottinghuis 2012-12-22, 17:17
Edward Capriolo 2012-12-22, 17:51
Suresh Srinivas 2012-12-22, 18:32
Edward Capriolo 2012-12-23, 00:03
Edward Capriolo 2012-12-23, 01:59
Suresh Srinivas 2012-12-23, 03:23
Edward Capriolo 2012-12-23, 18:34
Joep Rottinghuis 2012-12-23, 19:00
Suresh Srinivas 2012-12-24, 02:40
Edward Capriolo 2012-12-27, 21:48
Suresh Srinivas 2012-12-27, 22:08
Edward Capriolo 2012-12-27, 22:22
Suresh Srinivas 2012-12-27, 22:41
Edward Capriolo 2012-12-27, 22:58
Suresh Srinivas 2012-12-27, 23:12