HBase, mail # user - HBase Region/Table Hotspotting


HBase Region/Table Hotspotting
Joarder KAMAL 2013-02-11, 02:17
This is my first email to the group. I have a rather general, open-ended
question and hope to get some insight from the HBase user community.
I am a very basic HBase user and still learning. I intend to use HBase
in one of our research projects. Recently I was reading Lars George's book
"HBase - The Definitive Guide" and two particular topics caught my eye.
One is 'Region and Table Hotspotting' and the other is
'Region Auto-Sharding and Merging'.

*Scenario:*
If a hotspot develops in a particular region, or in a table with multiple
regions, because of a sudden workload change, one may split the region into
smaller pieces and distribute them across the available physical machines
in the cluster. I assume this process requires large data transfers between
machines in the cluster and so incurs a performance cost. Alternatively, one
may change the row key design and manage the regions that way, but I am not
sure how effective or sensible it is to change the key design on a
production system.
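
As an illustration of the key-redesign option mentioned above, here is a minimal sketch of one common approach, salting: each row key gets a short hash-derived prefix so that sequential keys no longer sort next to each other. This is plain Python and does not call any HBase API; the function name `salted_key`, the bucket count of 8, and the `NN-` prefix format are assumptions made up for the example, not something taken from the book or from HBase itself.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical bucket count, chosen only for illustration


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix row_key with a deterministic, hash-derived salt bucket."""
    raw = row_key.encode("utf-8")
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.md5(raw).digest()
    bucket = digest[0] % num_buckets
    # A fixed-width two-digit prefix keeps each bucket's keys contiguous
    # when the keys are sorted lexicographically.
    return f"{bucket:02d}-".encode("ascii") + raw


if __name__ == "__main__":
    # Sequential keys (e.g. timestamps) that would otherwise sort together.
    keys = [f"2013-02-11T02:17:{s:02d}" for s in range(60)]
    buckets = {salted_key(k)[:2] for k in keys}
    print(f"{len(keys)} sequential keys spread over {len(buckets)} buckets")
```

The trade-off in a scheme like this is that a scan over a contiguous range of the original keys has to be issued once per bucket, and changing the bucket count later means rewriting existing keys, which is part of why changing the key design on a live system is a concern.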

*Questions:*

   1. How often do you encounter region or table hotspotting in HBase
   production systems?
   2. When a hotspot appears (say, after a sudden workload change), how
   quickly does it clear on its own?
   3. How often is a hotspot detected and then disappears before any action
   is taken, as opposed to persisting for a longer period?
   4. If a hotspot persists, how is it generally handled in a production
   system?
   5. How is the cost of large data transfers minimized or avoided when
   re-sharding regions within a cluster, whether in a single data center or
   across a WAN?
   6. Is hotspotting in an HBase cluster really a big issue nowadays for
   OLAP workloads and real-time analytics?
Any pointers to further information about region/table hotspotting are
most welcome.

Many thanks in advance.

Regards,
Joarder Kamal