HBase master failover
Julian Zhou 2013-08-06, 07:46
Could you help confirm whether this case makes sense for 0.94 or trunk?
The default of "hbase.rpc.timeout" is 60000 ms (1 min). Users sometimes
increase it to a larger value, such as 600000 ms (10 mins), for
applications that run many concurrent loads from the client. Some users
share the same hbase-site.xml between client and server.
HRegionServer#tryRegionServerReport reports to the live master over the
rpc channel, but there is a window in the master failover scenario: the
region server is attempting to connect to the master, which was just
killed; the backup master takes the active role immediately and
registers itself at /hbase/master, but the region server is still
waiting for the rpc timeout on its connection to the dead master. If
"hbase.rpc.timeout" is too long, the master failover process will take
a long time because of the long rpc timeout against the dead master.
If so, could we separate this into 2 options: "hbase.rpc.timeout" would
still apply to the hbase client, while "hbase.rpc.internal.timeout"
would apply to this regionserver/master rpc channel and could be set to
a shorter value without affecting the real client rpc timeout?
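To make the proposal concrete, a shared hbase-site.xml might look roughly
like the sketch below. Note that "hbase.rpc.internal.timeout" is only the
option proposed here, not an existing property, and the 60000 ms value
for it is just an illustrative choice:

```xml
<configuration>
  <!-- Existing option: raised to 10 mins for heavy client loading -->
  <property>
    <name>hbase.rpc.timeout</name>
    <value>600000</value>
  </property>
  <!-- Proposed (hypothetical) option: would apply only to the
       regionserver/master rpc channel, so that a region server gives
       up on a dead master after 1 min instead of 10 -->
  <property>
    <name>hbase.rpc.internal.timeout</name>
    <value>60000</value>
  </property>
</configuration>
```

With such a split, client and server could keep sharing one file while
the failover window is bounded by the shorter internal value.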
Best Regards, Julian