HBase, mail # user - Performance at large number of regions/node


Performance at large number of regions/node
Jacob Isaac 2010-05-27, 19:09
Hi,

I would like to hear the group's experience with HBase performance as the
number of regions per node increases.
Is there an optimal number of regions per node that one should aim for?

Our current setup:

17-node HBase (0.20.4) cluster on a 20-node Hadoop (0.20.2) cluster

16 GB RAM per node, 4 GB of it allocated to HBase
Disk space available for Hadoop + HBase: ~1.5 TB per node

We are currently loading 2 tables, each with ~100m rows, which results in
~4000 regions with the default hbase.hregion.max.filesize of 256m,
and half that number of regions when we double
hbase.hregion.max.filesize to 512m.
The two runs did not differ in load time, however: both took ~9 hrs.

With the current load we are using only 10% of the available disk space.
Full utilization would increase the number of regions further,
so we would like to hear the group's experience and suggestions in this regard.
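
To put rough numbers on the concern, here is a back-of-the-envelope sketch. It assumes that ~4000 is the total region count across the cluster, that regions are spread evenly over the 17 HBase nodes, and that the region count grows linearly with the data volume. The function names are only illustrative, and none of this is measured.

```python
# Rough projection of regions per node at full disk utilization.
# Assumptions (not measured): ~4000 is the cluster-wide total, regions are
# spread evenly across nodes, and region count scales linearly with data.

HBASE_NODES = 17          # nodes in the HBase cluster
DISK_PERCENT_USED = 10    # share of available disk used by the current load


def projected_regions(current_regions: int, percent_used: int) -> int:
    """Scale the current region count up to 100% disk utilization."""
    return current_regions * 100 // percent_used


def regions_per_node(total_regions: int, nodes: int) -> float:
    """Average regions per node, assuming an even spread."""
    return total_regions / nodes


# Region counts observed for each hbase.hregion.max.filesize setting.
observed = {"256m": 4000, "512m": 2000}

for max_filesize, regions in observed.items():
    at_full = projected_regions(regions, DISK_PERCENT_USED)
    print(
        f"max.filesize={max_filesize}: "
        f"now {regions_per_node(regions, HBASE_NODES):.0f} regions/node, "
        f"at full disk {regions_per_node(at_full, HBASE_NODES):.0f} regions/node"
    )
```

Under these assumptions we are at a couple of hundred regions per node today and would be in the low thousands per node at full utilization, which is the range I am asking about.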

~Jacob