Re: HBase master failover
Julian Zhou 2013-08-06, 15:44
Thanks Nicolas. HBASE-9139: "Independent timeout configuration for rpc
channel between cluster nodes" has been opened to track it. So do you
think "hbase.rpc.internal.timeout" is a suitable name for this?
On 8/6/2013 4:19 PM, Nicolas Liochon wrote:
> Yes, it makes sense. Even a 1 minute timeout is not ideal in this case: we
> know that the work to do server side is trivial, and we know it's
> idempotent so we can retry. So I would tend to use a specific setting
> for such operations.
> Could you please create a jira for this?
> On Tue, Aug 6, 2013 at 9:46 AM, Julian Zhou <[EMAIL PROTECTED]> wrote:
>> Hi Community,
>> Could you check whether this case makes sense for 0.94 or trunk?
>> The default of "hbase.rpc.timeout" is 60000 ms (1 min). Users sometimes
>> increase it to a larger value, such as 600000 ms (10 mins), for client
>> applications that run many concurrent loads. Some users share the same
>> hbase-site.xml between client and server.
>> HRegionServer#tryRegionServerReport reports to the live master over the
>> rpc channel, but there is a window in the master failover scenario: the
>> region server was attempting to connect to the master, which had just
>> been killed; the backup master took the active role immediately and
>> wrote itself to /hbase/master, but the region server was still waiting
>> for the rpc timeout on its connection to the dead master. If
>> "hbase.rpc.timeout" is too long, the master failover process takes a
>> long time because of the long rpc timeout against the dead master.
>> If so, could we separate this into 2 options: "hbase.rpc.timeout" stays
>> for the hbase client, while "hbase.rpc.internal.timeout" is for the
>> regionserver/master rpc channel, which could be set to a shorter value
>> without affecting the real client rpc timeout?
>> Best Regards, Julian
Best Regards, Julian
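
For illustration, the split proposed in this thread could be sketched roughly as below. This is only a sketch under assumptions, not HBase code: "hbase.rpc.internal.timeout" is the name proposed above and is not an existing HBase key, the RpcTimeouts class and its methods are hypothetical, a plain Map stands in for the real configuration object, and the fallback to "hbase.rpc.timeout" when the internal key is unset is an assumed design choice. The 10000 ms value is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: resolves separate timeouts for client RPCs and for
// the regionserver/master channel from one shared configuration.
public class RpcTimeouts {
    // Existing setting; 60000 ms is the default mentioned in the thread.
    static final String CLIENT_KEY = "hbase.rpc.timeout";
    static final int CLIENT_DEFAULT_MS = 60000;
    // Proposed in this thread (HBASE-9139); an assumption, not a real key.
    static final String INTERNAL_KEY = "hbase.rpc.internal.timeout";

    // Timeout used by ordinary client RPCs.
    static int clientTimeoutMs(Map<String, String> conf) {
        String value = conf.get(CLIENT_KEY);
        return value == null ? CLIENT_DEFAULT_MS : Integer.parseInt(value);
    }

    // Timeout for the regionserver/master channel. Assumed behaviour:
    // fall back to the client timeout when the internal key is not set,
    // so existing hbase-site.xml files keep working unchanged.
    static int internalTimeoutMs(Map<String, String> conf) {
        String value = conf.get(INTERNAL_KEY);
        return value == null ? clientTimeoutMs(conf) : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        // Shared client/server config with a long client timeout (10 mins)
        // and a short internal timeout (illustrative value).
        Map<String, String> conf = new HashMap<>();
        conf.put(CLIENT_KEY, "600000");
        conf.put(INTERNAL_KEY, "10000");
        System.out.println("client   = " + clientTimeoutMs(conf) + " ms");
        System.out.println("internal = " + internalTimeoutMs(conf) + " ms");
    }
}
```

With a fallback like this, a region server reporting to a dead master would give up after the shorter internal timeout, while client loads keep the 10 minute value from the same hbase-site.xml.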