

Edward Capriolo 2012-12-22, 04:24
Adam Faris 2012-12-22, 04:59
Edward Capriolo 2012-12-22, 12:54
Michael Segel 2012-12-22, 15:42
Joep Rottinghuis 2012-12-22, 17:17
Edward Capriolo 2012-12-22, 17:51
Suresh Srinivas 2012-12-22, 18:32
Edward Capriolo 2012-12-23, 00:03
Edward Capriolo 2012-12-23, 01:59
Suresh Srinivas 2012-12-23, 03:23
Edward Capriolo 2012-12-23, 18:34
Joep Rottinghuis 2012-12-23, 19:00
Suresh Srinivas 2012-12-24, 02:40
Edward Capriolo 2012-12-27, 21:48
Suresh Srinivas 2012-12-27, 22:08
Edward Capriolo 2012-12-27, 22:22
Suresh Srinivas 2012-12-27, 22:41
Re: NN Memory Jumps every 1 1/2 hours
I tried your suggested setting and forced a GC from JConsole, but once the
heap had crept up, nothing was being freed.

So just food for thought:

You said "average file name size is 32 bytes". Well most of my data sits in

/user/hive/warehouse/
Then I have tables with partitions.

Does it make sense to just move this to "/u/h/w"?

Will I be saving 400,000,000 bytes of memory if I do?
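
A back-of-the-envelope check of that figure, as a sketch only. It assumes a model in which every entry stores its own copy of the full path prefix, and `num_entries` is a hypothetical parameter, since no file count is given in this thread. The quoted reply below describes the NameNode as holding a directory tree, so each directory name is probably stored once rather than once per file; in that case only the single-copy saving applies.

```python
# Sketch only: sizes the saving from renaming a path prefix.
# The per-entry model and num_entries are assumptions, not figures from the thread.

OLD_PREFIX = "/user/hive/warehouse"
NEW_PREFIX = "/u/h/w"


def prefix_bytes_saved(old=OLD_PREFIX, new=NEW_PREFIX):
    """Bytes saved on ONE stored copy of the prefix."""
    return len(old.encode("utf-8")) - len(new.encode("utf-8"))


def saving_if_prefix_repeated(num_entries, old=OLD_PREFIX, new=NEW_PREFIX):
    """ASSUMPTION: every entry stores the full path, prefix included."""
    return num_entries * prefix_bytes_saved(old, new)


def entries_needed_for(target_bytes, old=OLD_PREFIX, new=NEW_PREFIX):
    """Entries needed (rounded up) for the repeated-prefix model to reach target_bytes."""
    per_entry = prefix_bytes_saved(old, new)
    return -(-target_bytes // per_entry)  # ceiling division on integers


if __name__ == "__main__":
    print("bytes saved per copy of the prefix:", prefix_bytes_saved())
    print("entries needed to save 400,000,000 bytes:", entries_needed_for(400_000_000))
```

The rename shortens the prefix by 14 bytes. Under the repeated-prefix assumption, a 400,000,000-byte saving would need roughly 28.6 million entries below the warehouse directory; if each directory name is held once, the total saving is on the order of those 14 bytes.
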
On Thu, Dec 27, 2012 at 5:41 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:

> I do not follow what you mean here.
>
> > Even when I forced a GC it cleared 0% memory.
> Is this with the new younggen setting? Because earlier, based on the
> calculation I posted, you need ~11G in the old generation. With 6G as the
> default younggen size, you actually had just enough memory to fit the
> namespace in oldgen. Hence you might not have seen Full GC freeing up
> enough memory.
>
> Have you tried a Full GC with the 1G younggen size? I suspect
> you would see a lot more memory freeing up.
>
> > One would think that since the entire NameNode image is stored in memory
> > that the heap would not need to grow beyond that
> The NameNode image size that you see during checkpointing is the size of the
> file written after serializing the in-memory file system namespace. This is
> not what is directly stored in NameNode memory. The NameNode stores data
> structures that correspond to the file system directory tree and the block
> locations. Of these, only the file system directory tree is serialized and
> written to the fsimage. Block locations are not.
>
>
>
>
> On Thu, Dec 27, 2012 at 2:22 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
> > I am not sure GC was a factor. Even when I forced a GC it cleared 0%
> > memory. One would think that since the entire NameNode image is stored in
> > memory that the heap would not need to grow beyond that, but that sure
> > does not seem to be the case. A 5GB image starts off using 10GB of memory
> > and after "burn in" it seems to use about 15GB of memory.
> >
> > So really we say "the name node data has to fit in memory" but what we
> > really mean is "the name node data must fit in memory 3x"
> >
> > On Thu, Dec 27, 2012 at 5:08 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
> >
> > > You did free up a lot of the old generation by reducing the young
> > > generation, right? The extra 5G of RAM for the old generation should
> > > have helped.
> > >
> > > Based on my calculation, for the current number of objects you have, you
> > > need roughly 12G of total heap with a young generation size of 1G. This
> > > assumes the average file name size is 32 bytes.
> > >
> > > In later releases (>= 0.20.204), several memory and startup
> > > optimizations have been made. They should help you as well.
> > >
> > >
> > >
> > > On Thu, Dec 27, 2012 at 1:48 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
> > >
> > > > So it turns out the issue was just the size of the filesystem.
> > > > 2012-12-27 16:37:22,390 WARN
> > > > org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint
> > > > done. New Image Size: 4,354,340,042
> > > >
> > > > Basically if the NN image size hits ~ 5,000,000,000 you get f'ed. So you
> > > > need about 3x ram as your FSImage size. If you do not have enough you
> > > > die a slow death.
> > > >
> > > > On Sun, Dec 23, 2012 at 9:40 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Do not have access to my computer. Based on reading the previous
> > > > > email, I do not see anything suspicious in the list of objects in
> > > > > the histo live dump.
> > > > >
> > > > > I would like to hear from you whether it continued to grow. One
> > > > > instance of this I had seen in the past was related to weak
> > > > > references related to socket objects. I do not see that happening
> > > > > here though.
> > > > >
> > > > > Sent from phone
> > > > >
> > > > > On Dec 23, 2012, at 10:34 AM, Edward Capriolo <[EMAIL PROTECTED]>
> > > > > wrote:
> > > > >
Suresh Srinivas 2012-12-27, 23:12