HBase, mail # dev - HBase master failover


Re: HBase master failover
Nicolas Liochon 2013-08-06, 15:52
Thanks Julian. I've added a comment in the JIRA; let's continue there, as it
will be easier to track later.
On Tue, Aug 6, 2013 at 5:44 PM, Julian Zhou <[EMAIL PROTECTED]> wrote:

> Thanks Nicolas. HBASE-9139: "Independent timeout configuration for rpc
> channel between cluster nodes" has been opened to track it. So do you think
> "hbase.rpc.internal.timeout" is a suitable name for this configuration?
>
> On 8/6/2013 4:19 PM, Nicolas Liochon wrote:
>
>  Yes, it makes sense. Even a 1 minute timeout is not ideal in this case: we
>> know that the work to do server side is trivial, and we know it's
>> idempotent so we can retry. So I would tend to use a specific setting
>> for such operations.
>>
>> Could you please create a jira for this?
>>
>> Thanks,
>>
>> Nicolas
>>
>>
>>
>> On Tue, Aug 6, 2013 at 9:46 AM, Julian Zhou <[EMAIL PROTECTED]> wrote:
>>
>>  Hi Community,
>>> Could you tell me whether this case makes sense for 0.94 or trunk?
>>> The default of "hbase.rpc.timeout" is 60000 ms (1 min). Users sometimes
>>> increase it to a bigger value such as 600000 ms (10 mins) for
>>> applications that do many concurrent loads from the client. Some users
>>> share the same hbase-site.xml between client and server.
>>> HRegionServer#tryRegionServerReport reports to the live master over the
>>> rpc channel, but there is a window in the master failover scenario: the
>>> region server tries to connect to the master, which has just been
>>> killed; the backup master takes the active role immediately and
>>> registers under /hbase/master, but the region server is still waiting
>>> for the rpc timeout on its connection to the dead master. If
>>> "hbase.rpc.timeout" is too long, the master failover process takes
>>> correspondingly long.
>>>
>>> If so, could we separate this into 2 options: "hbase.rpc.timeout" stays
>>> for the hbase client, while "hbase.rpc.internal.timeout" is for the
>>> regionserver/master rpc channel and could be set to a shorter value
>>> without affecting the real client rpc timeout?
>>>
>>> --
>>> Best Regards, Julian
>>>
>>>
>>>
>
> --
> Best Regards, Julian
>
>
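
For readers following the thread, the proposal quoted above can be sketched
roughly as below. This is only an illustration, not HBase code: it uses
java.util.Properties in place of the real HBase Configuration class, the key
"hbase.rpc.internal.timeout" is merely the name suggested in this thread, the
10000 ms value is an arbitrary example, and the fallback to
"hbase.rpc.timeout" when the internal key is unset is an assumption made here
so that existing deployments would keep their current behaviour.

```java
import java.util.Properties;

public class RpcTimeoutSketch {
    // Existing setting discussed in the thread; default is 60000 ms (1 min).
    static final String CLIENT_KEY = "hbase.rpc.timeout";
    static final int DEFAULT_RPC_TIMEOUT_MS = 60000;

    // Name proposed in this thread (hypothetical, not a confirmed setting).
    static final String INTERNAL_KEY = "hbase.rpc.internal.timeout";

    /** Timeout used for client-facing RPCs. */
    static int clientTimeoutMs(Properties conf) {
        String value = conf.getProperty(CLIENT_KEY);
        return value != null ? Integer.parseInt(value.trim()) : DEFAULT_RPC_TIMEOUT_MS;
    }

    /**
     * Timeout used for regionserver/master RPCs such as the region server
     * report. Assumption: falls back to the client timeout when unset.
     */
    static int internalTimeoutMs(Properties conf) {
        String value = conf.getProperty(INTERNAL_KEY);
        return value != null ? Integer.parseInt(value.trim()) : clientTimeoutMs(conf);
    }

    public static void main(String[] args) {
        // A shared hbase-site.xml where the client timeout was raised to 10 mins.
        Properties conf = new Properties();
        conf.setProperty(CLIENT_KEY, "600000");
        conf.setProperty(INTERNAL_KEY, "10000"); // illustrative value only

        System.out.println("client rpc timeout (ms):   " + clientTimeoutMs(conf));
        System.out.println("internal rpc timeout (ms): " + internalTimeoutMs(conf));
    }
}
```

With the two keys separated this way, a region server waiting on a dead
master would give up after the shorter internal timeout and could retry the
(idempotent) report against the new active master, while client loads keep
the longer timeout.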