Re: HBaseClient recovery from .META. server power down
Suraj Varma 2012-07-10, 16:46
I will create a JIRA ticket ...
The only side-effect I could think of is ... if a RS is having a GC of
a few seconds, any _new_ client trying to connect would get connect
failures. So ... the _initial_ connection to the RS is what would
suffer from a super-low setting of the ipc.socket.timeout. This was my
read of the code.
So - I was hoping to get confirmation that this is the only side effect.
Again - this is on the client side - I wouldn't risk doing this on the
cluster side ...
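
For a rough sense of why the client-side timeout matters so much in the
scenario quoted below, here is a back-of-envelope sketch in Java. It is not
HBase code: it only assumes, as the thread dumps suggest, that connect
attempts to the dead RS are serialized behind one lock and that each attempt
blocks for the full ipc.socket.timeout. The class name, the thread count and
the attempts per thread are made-up values for illustration, not measurements
from the test described below.

```java
import java.util.concurrent.TimeUnit;

class ConnectTimeoutTradeoff {

    /**
     * Worst-case time in milliseconds for a queue of threads to each make
     * their connect attempts one after another, when every attempt blocks
     * for the full socket timeout (the dead host never answers).
     */
    static long serializedDrainMillis(int queuedThreads,
                                      int attemptsPerThread,
                                      long socketTimeoutMs) {
        return (long) queuedThreads * attemptsPerThread * socketTimeoutMs;
    }

    public static void main(String[] args) {
        int queuedThreads = 100;    // hypothetical: client threads stuck behind the lock
        int attemptsPerThread = 1;  // hypothetical: one connect attempt per thread

        // 20000 ms is the default mentioned in the thread; the rest are the
        // lower values that were discussed or tried.
        long[] timeoutsMs = {20000L, 10000L, 1000L, 100L};

        for (long timeoutMs : timeoutsMs) {
            long drainMs = serializedDrainMillis(queuedThreads, attemptsPerThread, timeoutMs);
            System.out.printf("ipc.socket.timeout=%d ms -> worst-case drain about %d s%n",
                    timeoutMs, TimeUnit.MILLISECONDS.toSeconds(drainMs));
        }
    }
}
```

With these made-up inputs the estimate scales linearly with the timeout,
which is consistent with the much faster recovery seen after lowering it.
The real figure depends on how many threads are queued and how often each
one retries.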
On Mon, Jul 9, 2012 at 9:44 AM, N Keywal <[EMAIL PROTECTED]> wrote:
> What you're describing -the 35 minutes recovery time- seems to match
> the code. And it's a bug (still there on trunk). Could you please
> create a jira for it? If you have the logs, even better.
> Lowering the ipc.socket.timeout seems to be an acceptable partial
> workaround. Setting it to 10s seems ok to me. Lower than this... I
> don't know.
> On Mon, Jul 9, 2012 at 6:16 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>> I'd like to get advice on the below strategy of decreasing the
>> "ipc.socket.timeout" configuration on the HBase Client side ... has
>> anyone tried this? Has anyone had any issues with configuring this
>> lower than the default 20s?
>> On Mon, Jul 2, 2012 at 5:51 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>>> By "power down" below, I mean powering down the host with the RS that
>>> holds the .META. table. (So - essentially, the host IP is unreachable
>>> and the RS/DN is gone.)
>>> Just wanted to clarify my below steps ...
>>> On Mon, Jul 2, 2012 at 5:36 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>>>> We've been doing some failure scenario tests by powering down a .META.
>>>> holding region server host and while the HBase cluster itself recovers
>>>> and reassigns the META region and other regions (after we tweaked down
>>>> the default timeouts), our client apps using HBaseClient take a long
>>>> time to recover.
>>>> hbase-0.90.6 / cdh3u4 / JDK 1.6.0_23
>>>> 1) Apply load via client app on HBase cluster for several minutes
>>>> 2) Power down the region server holding the .META. table
>>>> 3) Measure how long it takes for cluster to reassign META table and
>>>> for client threads to re-lookup and re-orient to the lesser cluster
>>>> (minus the RS and DN on that host).
>>>> What we see:
>>>> 1) Client threads spike up to maxThread size ... and take over 35 mins
>>>> to recover (i.e. for the thread count to go back to normal) - no calls
>>>> are being serviced - they are all just backed up on a synchronized
>>>> method ...
>>>> 2) Essentially, all the client app threads queue up behind the
>>>> HBaseClient.setupIOStreams method in org.apache.hadoop.hbase.ipc.HBaseClient
>>>> After taking several thread dumps we found that the thread within this
>>>> synchronized method was blocked on
>>>> NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf));
>>>> Essentially, the thread which got the lock would try to connect to the
>>>> dead RS (till socket times out), retrying, and then the next thread
>>>> gets in and so forth.
>>>> Solution tested:
>>>> So - the ipc.HBaseClient code shows ipc.socket.timeout default is 20s.
>>>> We dropped this down to a low number (1000 ms, 100 ms, etc) and the
>>>> recovery was much faster (in a couple of minutes).
>>>> So - we're thinking of setting the HBase client side hbase-site.xml
>>>> with an ipc.socket.timeout of 100ms. Looking at the code, it appears
>>>> that this is only ever used during the initial "HConnection" setup via
>>>> the NetUtils.connect and should only ever be used when connectivity to
>>>> a region server is lost and needs to be re-established. i.e. it does
>>>> not affect the normal "RPC" activity as this is just the connect