Re: Memory distribution for Hadoop/Hbase processes
Ted Yu 2013-08-07, 17:05
For your question #2, see:
On Wed, Aug 7, 2013 at 10:00 AM, Dhaval Shah <[EMAIL PROTECTED]> wrote:
> You are way underpowered. I don't think you are going to get reasonable
> performance out of this hardware with so many processes running on it
> (especially memory-heavy processes like HBase). Obviously, the severity
> depends on your use case.
> I would say you can decrease the memory allocation to the
> namenode/datanodes/secondary namenode/hbase master/zookeeper and increase
> the allocation to the region servers.
> From: Vimal Jain <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Wednesday, 7 August 2013 12:47 PM
> Subject: Re: Memory distribution for Hadoop/Hbase processes
> Hi Ted,
> I am using CentOS.
> I could not get the output of "ps aux | grep pid" as HBase/Hadoop is
> currently down in production due to some internal reasons.
> Can you please help me figure out the memory distribution for my
> single-node cluster (pseudo-distributed mode)?
> Currently it has just 4 GB of RAM. I can also try to bring it up to 6 GB.
> So I have come up with the following distribution:
> Name node - 512 MB
> Data node - 1024 MB
> Secondary Name node - 512 MB
> HMaster - 512 MB
> HRegion - 2048 MB
> Zookeeper - 512 MB
> So the total memory allocation is 5 GB, and I still have 1 GB left for the OS.
> 1) So is it fine to go ahead with this configuration in production? (I
> am asking this because I had "long GC pause" problems in the past when I
> did not change the JVM memory allocation configuration in hbase-env.sh and
> hadoop-env.sh, so it was taking the default values, i.e. 1 GB for each of
> the 6 processes, so the total allocation was 6 GB and I had only 4 GB of
> RAM. After this I just assigned 1.5 GB to HRegion and 512 MB each to
> HMaster and Zookeeper. I forgot to change it for the Hadoop processes. I
> also changed the kernel parameter vm.swappiness to 0. After this, it was
> working fine.)
> 2) Currently I am running pseudo-distributed mode, as my data size is at
> most 10-15 GB at present. How easy is it to migrate from pseudo-distributed
> mode to fully distributed mode in the future if my data size increases
> (which will be the case for sure)?
> Thanks for your help. Really appreciate it.
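
The arithmetic behind the proposed distribution can be checked with a short script. This is only a sketch: the heap figures are copied from the proposal above, the 6 GB of RAM is the planned upgrade rather than the current box, and the shell variable names (NAMENODE_MB and so on) are made up for illustration. The script also prints candidate lines for hadoop-env.sh and hbase-env.sh using the per-daemon *_OPTS variables those files provide. Whether a per-daemon -Xmx takes precedence over HADOOP_HEAPSIZE or HBASE_HEAPSIZE depends on how the launch scripts order the JVM flags in your version, so confirm the effective value with ps after a restart.

```shell
#!/bin/sh
# Heap sizes in MB, copied from the proposed distribution above.
NAMENODE_MB=512
DATANODE_MB=1024
SECONDARY_NAMENODE_MB=512
HMASTER_MB=512
REGIONSERVER_MB=2048
ZOOKEEPER_MB=512
# Assumption: the box has been upgraded from 4 GB to 6 GB.
RAM_MB=6144

# Sum of all six max-heap settings, in MB.
total_heap_mb() {
  echo $((NAMENODE_MB + DATANODE_MB + SECONDARY_NAMENODE_MB + HMASTER_MB + REGIONSERVER_MB + ZOOKEEPER_MB))
}

TOTAL_MB=$(total_heap_mb)
HEADROOM_MB=$((RAM_MB - TOTAL_MB))
echo "total heap: ${TOTAL_MB} MB, left for OS: ${HEADROOM_MB} MB"

# Print candidate lines for the two env files; nothing is written to disk.
cat <<EOF
# hadoop-env.sh
export HADOOP_NAMENODE_OPTS="-Xmx${NAMENODE_MB}m \$HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xmx${DATANODE_MB}m \$HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx${SECONDARY_NAMENODE_MB}m \$HADOOP_SECONDARYNAMENODE_OPTS"
# hbase-env.sh
export HBASE_MASTER_OPTS="-Xmx${HMASTER_MB}m \$HBASE_MASTER_OPTS"
export HBASE_REGIONSERVER_OPTS="-Xmx${REGIONSERVER_MB}m \$HBASE_REGIONSERVER_OPTS"
export HBASE_ZOOKEEPER_OPTS="-Xmx${ZOOKEEPER_MB}m \$HBASE_ZOOKEEPER_OPTS"
EOF
```

Note that -Xmx only caps the Java heap. Each JVM also needs memory outside the heap (thread stacks, class metadata, direct buffers), so the real headroom left for the OS will be smaller than the figure printed here.
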
> On Sun, Aug 4, 2013 at 8:12 PM, Kevin O'dell <[EMAIL PROTECTED]> wrote:
> > My questions are:
> > 1) How is this thing working? It is working because Java can over-allocate
> > memory. You will know you are using too much memory when the kernel starts
> > killing processes.
> > 2) I just have one table whose size at present is about 10-15 GB, so what
> > should be the ideal memory distribution? Really, you should get a box with
> > more memory. You can currently only hold about 400 MB in memory.
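
One way to watch for the kernel killing processes, as described above, is to filter the kernel log for OOM-killer messages. The sketch below is an assumption-laden helper, not a definitive check: the function name oom_lines is made up, and the two patterns reflect common kernel message wording, which varies between kernel versions. For background, a JVM reserves its maximum heap as address space but only touches pages as the heap actually grows, which is why the configured heaps can add up to more than physical RAM without failing at startup.

```shell
# Filter kernel log text on stdin down to lines that look like OOM-killer
# activity. The patterns are an assumption about the message wording.
oom_lines() {
  grep -iE 'out of memory|killed process'
}

# Typical use (reading the kernel ring buffer may need root):
#   dmesg | oom_lines
# How much RAM and swap are actually in use, in MB:
#   free -m
# Current overcommit policy and swappiness:
#   cat /proc/sys/vm/overcommit_memory /proc/sys/vm/swappiness
```

If oom_lines prints anything that names a java process, the heaps are too large for the box in practice, whatever the configuration files say.
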
> > On Aug 4, 2013 9:58 AM, "Ted Yu" <[EMAIL PROTECTED]> wrote:
> > > What OS are you using ?
> > >
> > > What is the output from the following command ?
> > > ps aux | grep pid
> > > where pid is the process Id for Namenode, Datanode, etc.
> > >
> > > Cheers
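
The ps check requested above is mainly useful for reading each daemon's effective -Xmx flag and its resident memory. A minimal sketch follows, assuming the procps version of ps found on CentOS; the function names xmx_of and show_java_heaps are hypothetical helpers, not part of Hadoop or HBase.

```shell
# Print the last -Xmx flag found in a JVM command line read on stdin.
# tail -n 1 keeps the final occurrence, on the assumption that when the flag
# is repeated the last one is the one in effect.
xmx_of() {
  tr ' ' '\n' | grep -- '^-Xmx' | tail -n 1
}

# For every process whose command line mentions java, print its PID, its
# resident set size in KB, and its -Xmx flag (blank if none is set).
show_java_heaps() {
  ps -eo pid=,rss=,args= | grep '[j]ava' | while read -r pid rss args; do
    printf '%s rss=%sKB %s\n' "$pid" "$rss" "$(printf '%s\n' "$args" | xmx_of)"
  done
}
```

Running show_java_heaps while the cluster is up gives one line per daemon. Comparing the rss column with the -Xmx value shows how much of each configured heap is actually resident.
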
> > >
> > > On Sun, Aug 4, 2013 at 6:33 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > > I have configured HBase in pseudo-distributed mode with HDFS as the
> > > > underlying storage. I am not using the MapReduce framework as of now.
> > > > I have 4 GB of RAM.
> > > > Currently I have the following distribution of memory:
> > > >
> > > > Data Node, Name Node, Secondary Name Node: 1000 MB each (default
> > > > HADOOP_HEAPSIZE property)
> > > >
> > > > HMaster - 512 MB
> > > > HRegion - 1536 MB
> > > > Zookeeper - 512 MB
> > > >
> > > > So the total heap allocation becomes 5.5 GB, which is absurd as my total
> > > > RAM is only 4 GB, but the setup is still working fine in production. :-0
> > > >
> > > > My questions are:
> > > > 1) How is this thing working?
> > > > 2) I just have one table whose size at present is about 10-15 GB , so