RE: gauging cost of region movement
This is an interesting direction, and definitely file a JIRA as this could be an additional metric in the future, but it's not exactly what I had in mind.

One of the hardest parts of load balancing based on request count and other dynamic/transient measures is that you can get some pretty pathological conditions where you are always moving stuff around.

To guard against that, I think we'll need to move to more of a cost-based algorithm that takes into account not just the difference in request counts but also a baseline "cost" of moving a region.  The reduction in load imbalance between two unbalanced servers would have to outweigh the cost of moving the region.  As you say, the number of live operations on a given region could contribute to the cost of moving that region, but the best measure for that is probably just the request count (all requests incur a cost, not just active scanners).
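
As a rough sketch of the comparison described above (not an existing HBase API): the class name, method names, and both constants below are hypothetical and chosen only for illustration. The cost of a move is modeled as a fixed baseline plus a term proportional to the region's request count, and a move is accepted only if the resulting reduction in imbalance between the two servers exceeds that cost.

```java
/** Illustrative only: a toy cost check for deciding whether a region move pays off. */
public final class MoveCostSketch {

    /** Hypothetical fixed overhead of any region move, expressed in request units. */
    static final double BASELINE_MOVE_COST = 1000.0;

    /** Hypothetical weight: busier regions disrupt more requests when moved. */
    static final double PER_REQUEST_MOVE_COST = 0.5;

    /** Cost of moving a region, using its request count as the dynamic component. */
    static double moveCost(long regionRequestCount) {
        return BASELINE_MOVE_COST + PER_REQUEST_MOVE_COST * regionRequestCount;
    }

    /** How much the load gap between the two servers shrinks if the region moves. */
    static double imbalanceReduction(long sourceLoad, long targetLoad, long regionRequestCount) {
        long before = Math.abs(sourceLoad - targetLoad);
        long after = Math.abs((sourceLoad - regionRequestCount) - (targetLoad + regionRequestCount));
        return before - after;
    }

    /** Move only when the balance gained outweighs the cost of the move itself. */
    static boolean shouldMove(long sourceLoad, long targetLoad, long regionRequestCount) {
        return imbalanceReduction(sourceLoad, targetLoad, regionRequestCount)
                > moveCost(regionRequestCount);
    }
}
```

With these made-up constants, a small gap between servers never justifies a move, which is the kind of damping that should keep the balancer from constantly shuffling regions on transient differences.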


> -----Original Message-----
> From: Ted Yu [mailto:[EMAIL PROTECTED]]
> Sent: Monday, March 21, 2011 3:44 PM
> Subject: gauging cost of region movement
> Can we add a counter to HRegion for the number of open InternalScanners?
> We would decrement this counter when close() is called.
> Such a counter could be used to gauge the cost of moving the underlying region.
> Cheers
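
For reference, the counter proposed in the quoted message might look roughly like the sketch below. This is a standalone illustration, not HRegion code: ScannerCountSketch, TrackedScanner, openScanner() and getOpenScannerCount() are hypothetical names, and the real InternalScanner interface is not used here. The count goes up when a scanner is opened and down when close() is called, with a guard so that a repeated close() does not decrement twice.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative stand-in for a region that tracks how many scanners are open. */
public final class ScannerCountSketch {

    private final AtomicInteger openScanners = new AtomicInteger();

    /** Hypothetical scanner handle; opening it increments the region's counter. */
    final class TrackedScanner implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        TrackedScanner() {
            openScanners.incrementAndGet();
        }

        @Override
        public void close() {
            // Decrement once only, even if close() is called repeatedly.
            if (closed.compareAndSet(false, true)) {
                openScanners.decrementAndGet();
            }
        }
    }

    TrackedScanner openScanner() {
        return new TrackedScanner();
    }

    /** Number of scanners currently open against this region. */
    int getOpenScannerCount() {
        return openScanners.get();
    }
}
```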