HBase, mail # dev - assignment - is master being a watchdog useful?


Re: assignment - is master being a watchdog useful?
Andrew Purtell 2012-12-06, 03:53
10 minutes is still too long to be useful IMO.

On 12/6/12, Jimmy Xiang <[EMAIL PROTECTED]> wrote:
> Ideally, we don't need a watch dog.  If we open a region on a region
> server, the region will be opened there quickly.  If the region server
> dies in the middle, ServerShutdownHandler will take care of it.
>
> If this region server happens to be hot, it may take a while to open
> it.  If we don't time it out, the server may be even hotter.  If the
> region server could not open it here, other region servers may not be
> able to open it either.
>
> By the way, currently, the timeout interval is 10 minutes.
>
> If we can live with the hot region server issue, I don't see why we
> can't remove it right now.
>
> Thanks,
> Jimmy
>
> On Wed, Dec 5, 2012 at 5:20 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> My information here may be stale.
>>
>> I remember we increased the timeout interval from 3 to 30 minutes, because
>> the master injecting itself into mid-assignment often triggered races and
>> led to double assignments and other bad stuff. At 30 minutes, this is not
>> useful IMO. As an operator I'd run hbck to sort it out long before then.
>>
>>
>> On Thursday, December 6, 2012, Nicolas Liochon wrote:
>>
>>> See comments in HBASE-7247: the master checks how long the regionserver
>>> spends opening a region, and assigns the region to another server if it
>>> takes too long. It adds complexity.
>>>
>>> from Stack: "I'm currently of the opinion that this expensive facility of
>>> master failing an open because it has been taking too long on a particular
>>> regionserver has been of no use – worse, it has only caused headache – but
>>> I may be just not remembering and others out on dev list will have better
>>> recall than I."
>>>
>>> So, opinions & memories are more than welcome.
>>> Removing this feature would be a huge simplification!
>>>
>>> Cheers,
>>>
>>> Nicolas
>>>
>>
>>
>> --
>> Best regards,
>>
>>    - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>
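
For anyone following along, here is a rough sketch of the kind of watchdog being discussed: the master records when it asked a server to open a region and flags any open that has been pending longer than a timeout, so the region can be assigned elsewhere. This is only an illustration under my own assumptions. The class and method names (OpenTimeoutSketch, openRequested, openCompleted, timedOut) are hypothetical and are not taken from the HBase code; the 10-minute value is simply the interval Jimmy mentions above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a master-side open timeout; not HBase's actual classes.
public class OpenTimeoutSketch {
    // Region name -> time (ms) at which the master asked a server to open it.
    private final Map<String, Long> openStartedAt = new LinkedHashMap<>();
    private final long timeoutMs;

    public OpenTimeoutSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called when the master sends an open request for a region.
    public void openRequested(String region, long nowMs) {
        openStartedAt.put(region, nowMs);
    }

    // Called when the region server reports the region as opened.
    public void openCompleted(String region) {
        openStartedAt.remove(region);
    }

    // Regions whose open has been pending for longer than the timeout.
    // A watchdog would fail these opens and reassign the regions.
    public List<String> timedOut(long nowMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : openStartedAt.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The sketch leaves out the hard part the thread is about: between timedOut() returning a region and the master reassigning it, the original server may finish the open, which is where the races and double assignments mentioned above can come from.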