HBase >> mail # user >> several doubts about region split?


several doubts about region split?
Dear all,

The HBase reference book mentions that when a RegionServer splits a
region, it offlines the region being split, adds the daughter regions to
META, opens the daughters on the parent's hosting RegionServer, and then
reports the split to the Master.
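
To make sure I am reading those four steps correctly, here is a minimal
Python sketch of the sequence as I understand it. It is only a toy model,
not HBase code: the names ToyCluster and split_region, the "-a"/"-b"
daughter naming, and the in-memory META dictionary are all my own
assumptions for illustration.

```python
# Toy model of the split sequence described in the reference book.
# Not HBase code: every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    # META stand-in: region name -> hosting RegionServer
    meta: dict = field(default_factory=dict)
    # Regions that are currently open and serving requests
    online: set = field(default_factory=set)
    # Splits that have been reported to the Master
    master_log: list = field(default_factory=list)

    def split_region(self, parent: str) -> tuple:
        server = self.meta[parent]
        # 1. Offline the region being split.
        self.online.discard(parent)
        # 2. Add the daughter regions to META.
        #    (The parent's META entry is left untouched; what HBase
        #    does with it is not modeled here.)
        daughters = (parent + "-a", parent + "-b")
        for daughter in daughters:
            self.meta[daughter] = server
        # 3. Open the daughters on the parent's hosting RegionServer.
        self.online.update(daughters)
        # 4. Report the split to the Master.
        self.master_log.append((parent, daughters))
        return daughters


cluster = ToyCluster(meta={"r1": "rs1"}, online={"r1"})
daughters = cluster.split_region("r1")
print(daughters, [cluster.meta[d] for d in daughters])
```

In this model both daughters end up on the same RegionServer as the
parent, which is the behaviour my second question below is about.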

I have several questions:

1. What does "offline" mean? Does it mean that the region being split is
no longer available? What happens to read and write requests sent to that
region?

2. If I understand the description correctly, after the split a single
RegionServer hosts both of the regions produced by the split, instead of
the regions being placed on two different RegionServers. If so, what is the
benefit of this approach? The hot-spot problem is still there. Moreover,
this looks like a big problem if we use HBase's default split approach.
Suppose we bulk load data into an HBase cluster: initially every write
request is accepted by only one RegionServer. After enough write requests,
that RegionServer can no longer respond to writes because it has reached
its disk volume threshold. Hence, some data must be moved from that
RegionServer to another RegionServer. Why isn't this done at region split
time?

Thanks!

Yong