How did the nodes crash? I am asking because it would be good to know
where it hurts. As to your 6500 regions per region server, that is an
order of magnitude higher than we like to see. With that many regions you
are going to run into a few issues:
1.) Small flushes due to memstore being split between too many regions
2.) Too many compactions due to your small flushes
3.) Huge storefile indexes due to high storefile count
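To make issue 1 concrete, here is a rough back-of-the-envelope calculation. The 8 GB heap is an assumption for illustration; 0.4 is the 0.90-era default for hbase.regionserver.global.memstore.upperLimit:

```python
# Illustration (assumed 8 GB region server heap, default 0.90-era settings)
# of why thousands of regions force tiny flushes.
heap_gb = 8                      # assumed region server heap size
global_memstore_fraction = 0.4   # hbase.regionserver.global.memstore.upperLimit default
regions = 6500

global_memstore_mb = heap_gb * 1024 * global_memstore_fraction
per_region_mb = global_memstore_mb / regions

print(f"Global memstore budget: {global_memstore_mb:.0f} MB")
print(f"Average per-region share: {per_region_mb:.2f} MB")
# With the default per-region flush threshold of 64 MB, flushes instead
# fire under global memory pressure long before any single region fills
# up, producing many small storefiles and constant compactions.
```

Under these assumed numbers, each region's share of the memstore works out to well under 1 MB, versus the 64 MB a healthy region would flush.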
Typically we like to keep regions to a couple hundred per server at most for
optimal performance. There is no set maximum; rather, the practical limit is
the point at which performance degrades and the cluster dies. As for
recovering from a dead cluster: if you can truncate your data, I would
recommend moving to a 10 GB region size (not ideal for 0.90); that way, once
you upgrade to 0.92 (CDH4), you will be able to take advantage of large
region sizes without merging regions. If you can't truncate your table,
still move to 10 GB region sizes (hbase.hregion.max.filesize), then merge
your regions down until you are at a sane region count.
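For reference, the region-size change above would look something like this in hbase-site.xml (10 GB expressed in bytes; apply to every region server and restart):

```xml
<!-- Raise the region split threshold to 10 GB. This only affects when
     regions split going forward; existing regions keep their current
     boundaries until you merge them. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value>
</property>
```

If I recall correctly, merging on 0.90 is done offline with the bundled org.apache.hadoop.hbase.util.Merge tool while HBase is shut down, one pair of adjacent regions at a time, so budget time accordingly.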
On Mon, Jan 28, 2013 at 1:41 AM, James Chang <[EMAIL PROTECTED]> wrote:
> Does anyone know whether HBase 0.90.6 (CDH3U4) has a limit on the total
> number of regions per region server? I cannot find any such setting in
> HBase's configuration files. Did I miss something? One expert kindly
> provided a mailing thread (http://search-hadoop.com/m/cyFfl1SHnbD), but
> there seems to have been no further discussion...
> When I tried this on my small 6-node (CDH3U4) cluster, the average
> number of regions was 6500 per region server. When one node crashed
> yesterday, the region counts on two of the surviving region servers grew
> to about 9000; those two nodes then became very slow and died. In the
> end, the whole cluster went down and could not be restarted.
> So, my questions are:
> 1. What's the maximum number of regions per region server ?
> 2. In this incident, what's the best practice to recover the dead cluster?
> (other than adding more nodes, since that takes more time)
> Best Regards.
> James Chang
Customer Operations Engineer, Cloudera