-
gauging cost of region movement
Ted Yu 2011-03-21, 22:44
Can we add a counter to HRegion for the number of open InternalScanners? We would increment it when a scanner is opened and decrement it when close() is called.
Such a counter could be used to gauge the cost of moving the underlying region.
Cheers
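For illustration, the counter idea could look roughly like the sketch below. It is only a sketch: TrackedRegion and TrackedScanner are hypothetical stand-ins, not the real HRegion or InternalScanner classes, and the guard against double close() is an added assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a region; this is not the real HRegion class. */
class TrackedRegion {
    // Number of scanners that have been opened and not yet closed.
    private final AtomicInteger openScanners = new AtomicInteger();

    /** Opens a scanner and counts it until its close() is called. */
    TrackedScanner openScanner() {
        openScanners.incrementAndGet();
        return new TrackedScanner(this);
    }

    /** Current number of live scanners, usable as one input to a move cost. */
    int getOpenScannerCount() {
        return openScanners.get();
    }

    /** Called by a scanner exactly once when it is closed. */
    void scannerClosed() {
        openScanners.decrementAndGet();
    }
}

/** Hypothetical scanner handle that decrements the region counter at most once. */
class TrackedScanner implements AutoCloseable {
    private final TrackedRegion region;
    private boolean closed;

    TrackedScanner(TrackedRegion region) {
        this.region = region;
    }

    @Override
    public synchronized void close() {
        // Assumption: a repeated close() must not decrement the counter twice.
        if (!closed) {
            closed = true;
            region.scannerClosed();
        }
    }
}
```

An AtomicInteger keeps the count consistent when scanners are opened and closed from different handler threads.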
-
RE: gauging cost of region movement
Jonathan Gray 2011-03-21, 23:22
This is an interesting direction, and definitely file a JIRA as this could be an additional metric in the future, but it's not exactly what I had in mind.
One of the hardest parts of load balancing based on request count and other dynamic/transient measures is that you can get some pretty pathological conditions where you are always moving stuff around.
To guard against it, I think we'll need to move to more of a cost-based algorithm that takes into account not just the difference in request counts but also a baseline "cost" of moving a region. The difference in load between two unbalanced servers would have to outweigh the cost associated with moving a region. As you say, the number of live operations on a given region could contribute to the cost of moving that region, but the best measure for that is probably just the request count (all requests incur a cost, not just active scanners).
JG
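A minimal sketch of that rule, assuming load and move cost are expressed in the same units (for example requests per second). MoveDecision, shouldMove and all parameter names are hypothetical and do not come from HBase.

```java
/** Hypothetical helper; names and units are assumptions for illustration only. */
final class MoveDecision {
    private MoveDecision() {}

    /**
     * Decides whether moving one region is worth its cost.
     *
     * @param sourceLoad request rate on the busier server
     * @param targetLoad request rate on the quieter server
     * @param regionLoad request rate of the candidate region
     * @param moveCost   baseline cost of moving the region, in the same units
     */
    static boolean shouldMove(double sourceLoad, double targetLoad,
                              double regionLoad, double moveCost) {
        // Imbalance between the two servers before the move.
        double before = Math.abs(sourceLoad - targetLoad);
        // Imbalance after the region's load shifts from source to target.
        double after = Math.abs((sourceLoad - regionLoad) - (targetLoad + regionLoad));
        // The reduction in imbalance must outweigh the cost of the move.
        return (before - after) > moveCost;
    }
}
```

With a non-zero moveCost, small imbalances no longer trigger moves, which is what damps the constant shuffling described above.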
-
RE: gauging cost of region movement
Jonathan Gray 2011-03-21, 23:26
Also, using more stable measures of request count, such as 30-minute rolling averages, will help.
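One way such a rolling average could be kept is a ring of per-minute buckets, sketched below. RollingRequestAverage is a hypothetical class; the sketch assumes non-negative minute timestamps that never go backwards, and it counts minutes with no data as zero.

```java
/** Hypothetical rolling average of requests per minute over a fixed window. */
class RollingRequestAverage {
    private final long[] buckets;   // one request count per minute in the window
    private long currentMinute;     // minute of the most recent update
    private boolean started;

    RollingRequestAverage(int windowMinutes) {
        buckets = new long[windowMinutes];
    }

    /** Records requests seen during the given minute (minutes since some epoch). */
    void record(long minute, long requests) {
        advanceTo(minute);
        buckets[(int) (minute % buckets.length)] += requests;
    }

    /** Average requests per minute over the window ending at the given minute. */
    double average(long minute) {
        advanceTo(minute);
        long sum = 0;
        for (long b : buckets) {
            sum += b;
        }
        return (double) sum / buckets.length;
    }

    /** Clears the buckets for minutes that elapsed since the last update. */
    private void advanceTo(long minute) {
        if (!started) {
            currentMinute = minute;
            started = true;
            return;
        }
        long steps = Math.min(minute - currentMinute, buckets.length);
        for (long i = 1; i <= steps; i++) {
            buckets[(int) ((currentMinute + i) % buckets.length)] = 0;
        }
        if (minute > currentMinute) {
            currentMinute = minute;
        }
    }
}
```

Constructing it with new RollingRequestAverage(30) would give the 30-minute window mentioned above; a short burst then shifts the average only slightly.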
-
Re: gauging cost of region movement
Ryan Rawson 2011-03-21, 23:32
It would make sense to avoid moving regions unnecessarily, so the more recently a region was moved, the less likely we should be to move it again.
You could imagine a hypothetical perfect 'region move cost' function that might look like:
F(r) = timeSinceMoved(r) + size(r) + loadAvg(r)
Each term should probably be normalized to [0,1], so the range of F would be [0,3], with 3 == 'don't move' and 0 == 'move first'.
The goal is to minimize the total F(r[i]) over the regions chosen for the moves.
-ryan
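A rough sketch of such a function follows. RegionMoveCost, its parameters and the normalization limits are all hypothetical. One assumption is worth stating loudly: the first term is computed as recency (1 minus the normalized time since the last move), so that a recently moved region scores close to 1 and is less likely to be moved again, matching the intent stated above.

```java
/** Hypothetical sketch of a normalized region move cost; all limits are assumed inputs. */
final class RegionMoveCost {
    private RegionMoveCost() {}

    /** Scales value into [0,1] against max, clamping out-of-range input. */
    static double normalize(double value, double max) {
        if (max <= 0) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, value / max));
    }

    /**
     * Returns a cost in [0,3]: 3 means "don't move", 0 means "move first".
     *
     * @param minutesSinceMoved     time since the region was last moved
     * @param recencyHorizonMinutes age beyond which a past move no longer counts (assumed)
     * @param sizeMb                region size
     * @param maxSizeMb             size that maps to 1.0 (assumed)
     * @param loadAvg               region load average
     * @param maxLoadAvg            load that maps to 1.0 (assumed)
     */
    static double cost(double minutesSinceMoved, double recencyHorizonMinutes,
                       double sizeMb, double maxSizeMb,
                       double loadAvg, double maxLoadAvg) {
        // Assumption: recently moved regions score near 1, long-settled regions near 0.
        double recency = 1.0 - normalize(minutesSinceMoved, recencyHorizonMinutes);
        return recency + normalize(sizeMb, maxSizeMb) + normalize(loadAvg, maxLoadAvg);
    }
}
```

A balancer built on this would sort candidate regions by cost and move the lowest-scoring ones first, which keeps the summed cost of a round of moves small.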
-
Re: gauging cost of region movement
Ted Yu 2011-03-21, 23:35
I opened HBASE-3679 and pasted comments there.
Please continue on that JIRA.