Hadoop >> mail # user >> NN Memory Jumps every 1 1/2 hours


Edward Capriolo 2012-12-22, 04:24
Adam Faris 2012-12-22, 04:59
Edward Capriolo 2012-12-22, 12:54
Michael Segel 2012-12-22, 15:42
Joep Rottinghuis 2012-12-22, 17:17
Re: NN Memory Jumps every 1 1/2 hours
Newer 1.6 releases are getting close to 1.7, so I am not going to fear a
version number and fight the future.

I have been at around 27 million files for a while, and have been as high as
30 million, so I do not think that is related.

I do not think it is related to checkpoints but I am considering
raising/lowering the checkpoint triggers.
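
For a rough baseline to compare against the heap graph, the file count can be turned into an estimated namespace footprint. The sketch below uses the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (file or block). That figure is an approximation rather than a measurement, the class and method names are made up for illustration, and the block count is an assumption because the thread does not give one. Directories are ignored.

```java
public class NameNodeHeapEstimate {
    // Rule of thumb only (assumption): roughly 150 bytes of heap per file or block object.
    static final long BYTES_PER_OBJECT = 150L;

    /** Estimated heap, in bytes, needed to hold the given number of files and blocks. */
    static long estimateBytes(long files, long blocks) {
        return (files + blocks) * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long files = 27000000L;  // file count reported in the thread
        long blocks = files;     // ASSUMPTION: one block per file; the real ratio is unknown
        long bytes = estimateBytes(files, blocks);
        System.out.printf("estimated namespace heap: ~%.1f GB%n", bytes / 1e9);
    }
}
```

If the real block count is known (for example from `hadoop fsck /`), substituting it for the one-block-per-file assumption gives a better baseline.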

On Saturday, December 22, 2012, Joep Rottinghuis <[EMAIL PROTECTED]> wrote:
> Do your OOMs correlate with the secondary checkpointing?
>
> Joep
>
> Sent from my iPhone
>
> On Dec 22, 2012, at 7:42 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
>> Hey, silly question...
>>
>> How long have you had 27 million files?
>>
>> I mean, can you correlate the number of files to the spate of OOMs?
>>
>> Even without problems... I'd say it would be a good idea to upgrade, due
>> to the probability of a lot of code fixes...
>>
>> If you're running anything pre 1.x, going to Java 1.7 wouldn't be a good
>> idea.  Having said that... outside of MapR, have any of the distros
>> certified themselves on 1.7 yet?
>>
>> On Dec 22, 2012, at 6:54 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>>
>>> I will give this a go. I have actually gone into JMX and manually
>>> triggered GC; no memory is returned. So I assumed something was leaking.
>>>
>>> On Fri, Dec 21, 2012 at 11:59 PM, Adam Faris <[EMAIL PROTECTED]> wrote:
>>>
>>>> I know this will sound odd, but try reducing your heap size. We had an
>>>> issue like this where GC kept falling behind and we either ran out of
>>>> heap or would be in full GC. By reducing heap, we were forcing concurrent
>>>> mark sweep to occur and avoided both full GC and running out of heap
>>>> space, as the JVM would collect objects more frequently.
>>>>
>>>> On Dec 21, 2012, at 8:24 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> I have an old hadoop 0.20.2 cluster. Have not had any issues for a while
>>>>> (which is why I never bothered with an upgrade).
>>>>>
>>>>> Suddenly it OOMed last week. Now the OOMs happen periodically. We have a
>>>>> fairly large NameNode heap, Xmx 17GB. It is a fairly large FS, about
>>>>> 27,000,000 files.
>>>>>
>>>>> So the strangest thing is that every 1 and 1/2 hours the NN memory usage
>>>>> increases until the heap is full.
>>>>>
>>>>> http://imagebin.org/240287
>>>>>
>>>>> We tried failing over the NN to another machine. We changed the Java
>>>>> version from 1.6_23 -> 1.7.0.
>>>>>
>>>>> I have set the NameNode logs to debug and ALL and I have done the same
>>>>> with the data nodes.
>>>>> Secondary NN is running and shipping edits and making new images.
>>>>>
>>>>> I am thinking something has corrupted the NN metadata and after enough
>>>>> time it becomes a time bomb, but this is just a total shot in the dark.
>>>>> Does anyone have any interesting troubleshooting ideas?
>>
>
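
The check described in the quoted messages above (trigger a GC through JMX and see whether any memory comes back) can be sketched with the standard `java.lang.management` API. This is a minimal illustration that inspects the JVM it runs in. Pointing it at a NameNode would need a remote JMX connection, which is not shown here. The class and helper names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapAfterGc {
    /** Used heap in bytes, as reported by the platform MemoryMXBean. */
    static long usedHeapBytes(MemoryMXBean bean) {
        return bean.getHeapMemoryUsage().getUsed();
    }

    /** Fraction of the previously used heap that was freed (0.0 if nothing was freed). */
    static double reclaimedFraction(long before, long after) {
        if (before <= 0 || after >= before) {
            return 0.0;
        }
        return (double) (before - after) / before;
    }

    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        long before = usedHeapBytes(bean);
        bean.gc(); // requests a collection, like invoking gc on the Memory MBean from a JMX console
        long after = usedHeapBytes(bean);
        System.out.printf("used before=%d after=%d reclaimed=%.1f%%%n",
                before, after, 100 * reclaimedFraction(before, after));
    }
}
```

A reclaimed fraction that stays near zero while usage keeps climbing is consistent with live objects accumulating rather than with the collector falling behind, although it does not say which objects are being retained.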
Suresh Srinivas 2012-12-22, 18:32
Edward Capriolo 2012-12-23, 00:03
Edward Capriolo 2012-12-23, 01:59
Suresh Srinivas 2012-12-23, 03:23
Edward Capriolo 2012-12-23, 18:34
Joep Rottinghuis 2012-12-23, 19:00
Suresh Srinivas 2012-12-24, 02:40
Edward Capriolo 2012-12-27, 21:48
Suresh Srinivas 2012-12-27, 22:08
Edward Capriolo 2012-12-27, 22:22
Suresh Srinivas 2012-12-27, 22:41
Edward Capriolo 2012-12-27, 22:58
Suresh Srinivas 2012-12-27, 23:12