HBase >> mail # user >> Too many regions


Re: Too many regions
Turning off automatic region splitting can be reasonable if you know your
rowkey distribution well and can keep the load spread evenly across your
regionservers yourself, either manually or through the HBase API.
Sometimes it is even the best way to keep the number of regions to a
minimum (many companies do this). The Reference Guide has an example of
pre-splitting regions.
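
To make the pre-splitting idea concrete, here is a minimal sketch of how
split points could be computed. It assumes rowkeys start with a uniformly
distributed fixed-width hex prefix (for example a hash), which may not match
your schema; the function name and the key width are made up for
illustration.

```python
def hex_split_keys(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced split points.

    Assumption: rowkeys begin with a uniformly distributed, fixed-width,
    lowercase hex prefix of key_width characters.
    """
    if num_regions < 2:
        return []  # a single region needs no split point
    keyspace = 16 ** key_width
    return [
        format(i * keyspace // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


print(hex_split_keys(4))  # ['40000000', '80000000', 'c0000000']
```

A list like this can then be given to the HBase shell when creating the
table, e.g. create 'mytable', 'cf', SPLITS => ['40000000', '80000000',
'c0000000'], where 'mytable' and 'cf' are placeholder names.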

As for your region size, raising it to 2 GB or even more will help reduce
the number of regions and storeFiles.
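
As a rough sanity check, the figures from your message (quoted below) give
the following best case. This is only a back-of-the-envelope sketch: it
assumes every region is filled right up to the limit and that data is spread
evenly over the 5 regionservers, so the real counts will be higher. The
function name is made up for illustration.

```python
def min_regions(total_bytes, max_region_bytes):
    """Lower bound on region count: ceiling of total size / region size."""
    return -(-total_bytes // max_region_bytes)  # integer ceiling division


total_bytes = 926939501499      # from "hadoop fs -dus /hbase"
region_limit = 2 * 1024 ** 3    # 2 GB, taken here as 2 GiB
servers = 5

regions = min_regions(total_bytes, region_limit)
per_server = min_regions(regions, servers)

print(regions)     # 432
print(per_server)  # 87
```

So even allowing for regions that are far from full, a 2 GB limit leaves
plenty of room below the current 1010 regions per server.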

On Fri, Jul 13, 2012 at 10:31 PM, Rob Roland <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> The HBase instance I'm managing has grown to the point that it has way too
> many regions per server - 5 region servers with 1010 regions each on HBase
> 0.90.4-cdh3u2.  I want to bring this region count under control. The
> cluster is currently running with the default region size of 256 MB, and
> the data is spread across 17 tables.   I've turned on compression for all
> the column families, which is great, as my region count is growing much
> slower now. I've looked through HDFS at the individual regions, and they
> seem rather small - 40-50 MB - which is not surprising due to major
> compactions after enabling compression.  My total hbase folder size in HDFS
> (hadoop fs -dus /hbase) is 926,939,501,499 bytes.
>
> My question is - what's the best strategy for handling this?
>
> What I assume from reading the docs:
>
> 1. Increase the hbase.hregion.max.filesize to something more reasonable,
> like 2 GB.
> 2. Bring the cluster offline and merge regions.
>
> Is there a good way to determine the actual region sizes, other than
> manually, so that I can do the merges and end up with the most efficient
> regions, size-wise?
>
> At what point is it a good idea to turn off automatic region splits and
> manually manage them?
>
> Thanks,
>
> Rob Roland
> Senior Software Engineer
> Simply Measured, Inc.
>

--
Adrien Mogenet
06.59.16.64.22
http://www.mogenet.me