Hadoop, mail # user - NameNode heapsize


Re: NameNode heapsize
Brian Bockelman 2011-06-10, 12:22

On Jun 10, 2011, at 6:32 AM, [EMAIL PROTECTED] wrote:

> Dear all,
>
> I'm looking for ways to reduce the NameNode heap usage of an 800-node, 10PB test Hadoop cluster that stores
> around 30 million files.
>
> Here's some info:
>
> 1 x namenode:     32GB RAM, 24GB heap size
> 800 x datanode:   8GB RAM, 13TB hdd
>
> *33050825 files and directories, 47708724 blocks = 80759549 total. Heap Size is 22.93 GB / 22.93 GB (100%) *
>
> From the cluster summary report, heap usage always appears full and never drops. Do you know of any ways
> to reduce it? So far I haven't seen any NameNode OOM errors, so the memory assigned to the NameNode process looks (just)
> enough. But I'm curious which factors account for the heap being fully used.
>

The advice I give to folks is to plan on 1GB heap for every million objects.  It's an over-estimate, but I prefer to be on the safe side.  Why not increase the heap-size to 28GB?  Should buy you some time.
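
To make the rule of thumb concrete, here is a rough sketch of the arithmetic in Python using the counts from the cluster summary quoted above. It assumes "objects" means files and directories plus blocks, which is how the summary line totals them; the function name and the 1 GB per million default are illustrative only, not anything Hadoop itself provides.

```python
# Rule-of-thumb NameNode heap estimate: ~1 GB of heap per million objects.
# Assumption: "objects" = files/directories + blocks, as in the summary line.

def estimate_heap_gb(files_and_dirs, blocks, gb_per_million_objects=1.0):
    """Return a conservative heap estimate in GB (hypothetical helper)."""
    total_objects = files_and_dirs + blocks
    return total_objects / 1_000_000 * gb_per_million_objects

# Counts reported in the cluster summary.
files_and_dirs = 33_050_825
blocks = 47_708_724
total_objects = files_and_dirs + blocks

rule_of_thumb_gb = estimate_heap_gb(files_and_dirs, blocks)

# Heap currently reported as in use, divided by objects in millions.
# This is only an upper bound on live data, since the reported figure
# may include garbage that has not been collected yet.
observed_gb_per_million = 22.93 / (total_objects / 1_000_000)

print(f"total objects:        {total_objects}")
print(f"rule-of-thumb heap:   {rule_of_thumb_gb:.1f} GB")
print(f"observed GB/million:  {observed_gb_per_million:.2f}")
```

Under that assumption the rule asks for far more heap than the cluster is using today, which is consistent with it being an over-estimate; treat the result as a planning ceiling rather than a measurement.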

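You can turn on compressed pointers (-XX:+UseCompressedOops on a 64-bit HotSpot JVM, which only applies while the heap stays under roughly 32GB), but your best bet is really going to be spending some more money on RAM.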

Brian