Memory Consumption and Processing questions
Hello all,

I'm planning an hbase implementation and had some questions I was hoping
someone could help with.

1. Can someone give me a basic overview of how memory is used in HBase?
Various places on the web state that 16-24 GB is the minimum for region
servers if they also operate as HDFS/MR nodes.  Assuming the HDFS/MR
daemons consume ~8 GB, that leaves a "minimum" of 8-16 GB for HBase.  It
seems like lots of people suggest using even 24 GB+ for HBase.  Why so
much?  Is it simply to avoid GC problems?  To keep data in memory for
fast random reads?  Or something else?
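
For concreteness, here's the back-of-the-envelope arithmetic I've been
using; the per-daemon heap figures are guesses on my part, not numbers
from any documentation:

    // Heap budget sketch for a combined HDFS/MR/HBase node.
    // All per-daemon figures are my own assumptions, not recommendations.
    public class NodeHeapBudget {
        public static void main(String[] args) {
            int totalRamGb       = 24; // total RAM on the node
            int osAndBuffersGb   = 2;  // OS, misc daemons, headroom (guess)
            int dataNodeGb       = 1;  // DataNode heap (guess)
            int taskTrackerGb    = 1;  // TaskTracker heap (guess)
            int mapReduceTasksGb = 4;  // child task JVMs, e.g. 4 x 1 GB (guess)

            int regionServerGb = totalRamGb - osAndBuffersGb - dataNodeGb
                    - taskTrackerGb - mapReduceTasksGb;

            System.out.println("Heap left for the region server: "
                    + regionServerGb + " GB"); // 16 with the numbers above
        }
    }

With 16 GB of total RAM the same arithmetic leaves ~8 GB, which is where
my 8-16 GB range comes from.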

2. What types of things put more or less pressure on memory?  I've seen
it suggested that insert speed can create substantial memory pressure.
What kind of relative memory pressure do scanners, random reads, random
writes, region count, and compactions cause?
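
For reference, my current (possibly wrong) understanding is that the two
largest fixed consumers of region server heap are the memstores and the
block cache, each sized as a fraction of the heap.  The property names
and default fractions in the sketch below are my reading of the docs, so
please correct me if they're off:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Sketch of how I understand the fixed heap consumers to be sized.
    // Property names and defaults are assumptions from my reading of the docs.
    public class HeapConsumers {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            long heapBytes = Runtime.getRuntime().maxMemory();

            // Global ceiling on all memstores (write buffers), fraction of heap.
            float memstoreFraction = conf.getFloat(
                    "hbase.regionserver.global.memstore.upperLimit", 0.4f);

            // Block cache for HFile data blocks, fraction of heap.
            float blockCacheFraction = conf.getFloat(
                    "hfile.block.cache.size", 0.2f);

            System.out.printf("heap=%dMB memstores<=%dMB blockcache=%dMB%n",
                    heapBytes >> 20,
                    (long) (heapBytes * memstoreFraction) >> 20,
                    (long) (heapBytes * blockCacheFraction) >> 20);
        }
    }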

3. How CPU intensive are the region servers?  It seems like most of their
performance is bound by I/O.  (I've noted the cautions against starving
region servers of cycles, which seem primarily focused on avoiding ZK
timeouts and the region reassignments they trigger.)  Would anyone
recommend for or against dedicating only one or two cores to a region
server?  Do individual compactions benefit from multiple cores, or are
they single-threaded?

4. What are the memory and CPU resource demands of the master server?  It
seems like more and more of that load is moving to ZooKeeper.

5. A general HDFS question: when the namenode dies, what happens to the
datanodes, and how does that relate to HBase?  E.g., can HBase continue
to operate in read-only mode (assuming no datanode/regionserver failures
after the namenode failure)?

Thanks for your help,
Jacques