Ideally, we don't need a watchdog. If we open a region on a region
server, the region will be opened there quickly. If the region server
dies in the middle of the open, ServerShutdownHandler will take care of it.
If this region server happens to be hot, it may take a while to open
the region. If we don't time the open out, the server may get even
hotter. And if this region server cannot open the region, other region
servers may not be able to open it either.
By the way, currently, the timeout interval is 10 minutes.
If we are OK with the hot region server issue, I don't see why we can't
remove it right now.
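
For reference, the check under discussion boils down to something like the
sketch below. This is not the actual master code: the class and method names
are made up for illustration, and the only value taken from this thread is
the 10-minute interval.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not HBase code: records when each region open was
// requested and reports the opens that have been pending too long.
class OpenTimeoutSketch {
    // 10 minutes, the interval mentioned in this thread.
    static final long TIMEOUT_MS = 10L * 60L * 1000L;

    // Region name -> time (ms) at which the open was requested.
    private final Map<String, Long> openStartedAt = new LinkedHashMap<>();

    void openRequested(String region, long nowMs) {
        openStartedAt.put(region, nowMs);
    }

    void openCompleted(String region) {
        openStartedAt.remove(region);
    }

    // Regions whose open has been pending strictly longer than timeoutMs.
    List<String> timedOut(long nowMs, long timeoutMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : openStartedAt.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

What to do with the regions returned by timedOut() (reassign them, or
nothing at all) is the part in question here.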
On Wed, Dec 5, 2012 at 5:20 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> My information here may be stale.
> I remember we increased the timeout interval from 3 to 30 minutes, because
> the master injecting itself into mid-assignment often triggered races and
> led to double assignments and other bad stuff. At 30 minutes, this is not
> useful IMO. As an operator I'd run hbck to sort it out long before then.
> On Thursday, December 6, 2012, Nicolas Liochon wrote:
>> See comments in HBASE-7247: the master checks the time spent by the
>> regionserver, and assigns the region to another if it takes too long. It adds
>> from Stack: "I'm currently of the opinion that this expensive facility of
>> master failing an open because it has been taking too long on a particular
>> regionserver has been of no use – worse, it has only caused headache – but
>> I may be just not remembering and others out on dev list will have better
>> recall than I."
>> So, opinions & memories are more than welcome.
>> Removing this feature would be a huge simplification!
> Best regards,
> - Andy
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)