HBase >> mail # user >> X3 slow down after moving from HBase 0.90.3 to HBase 0.92.1


X3 slow down after moving from HBase 0.90.3 to HBase 0.92.1
Hi,

We have changed some parameters on our 16(!) region servers: 1 GB
more -Xmx, more RPC handlers (from 10 to 30), and a longer timeout, but
nothing seems to improve the response time:

- Scans with HBase 0.92 are 3x SLOWER than with HBase 0.90.3
- A lot of simultaneous gets lead to a huge slowdown of batch put &
random read response times

... despite the fact that our RS CPU load is really low (10%)

Note: we have not (yet) activated MSLAB, nor direct reads on HDFS.

Any ideas, please? I'm really stuck on this issue.

Best regards,
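
As a rough way to reason about the handler change described above (from 10 to 30), the following Java sketch computes an upper bound on requests per second when each request holds one RPC handler for its whole service time. The class name `HandlerThroughput` and the 50 ms service time are hypothetical, chosen only for illustration; they are not measurements from this cluster.

```java
// Rough upper bound on throughput when RPC handlers are the bottleneck.
// All numbers used here are hypothetical, not measurements.
final class HandlerThroughput {

    /** Max requests/second if each request occupies one handler for meanServiceMillis. */
    static double maxRequestsPerSecond(int handlers, double meanServiceMillis) {
        if (handlers <= 0 || meanServiceMillis <= 0) {
            throw new IllegalArgumentException("handlers and service time must be positive");
        }
        return handlers * 1000.0 / meanServiceMillis;
    }

    public static void main(String[] args) {
        double serviceMillis = 50.0; // hypothetical mean time a request holds a handler
        System.out.println(maxRequestsPerSecond(10, serviceMillis)); // prints 200.0
        System.out.println(maxRequestsPerSecond(30, serviceMillis)); // prints 600.0
    }
}
```

This bound only matters if handlers are actually saturated. If the time is spent elsewhere (for example in HDFS reads or on the client side), raising the handler count would not be expected to change response times much.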

On 16/11/12 20:55, Vincent Barat wrote:
> Hi,
>
> Until now (and previously with 0.90.3) we have been using the default
> value (10).
> We are now trying to increase it to 30 to see if that helps.
>
> Thanks for your concern
>
> On 16/11/12 18:13, Ted Yu wrote:
>> Vincent:
>> What's the value for hbase.regionserver.handler.count ?
>>
>> I assume you kept the same value as in 0.90.3
>>
>> Thanks
>>
>> On Fri, Nov 16, 2012 at 8:14 AM, Vincent Barat <[EMAIL PROTECTED]> wrote:
>>
>>> On 16/11/12 01:56, Stack wrote:
>>>
>>>> On Thu, Nov 15, 2012 at 5:21 AM, Guillaume Perrot <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> It happens when several tables are being compacted and/or when
>>>>> there are several scanners running.
>>>>>
>>>> It happens for a particular region?  Anything you can tell about the
>>>> server looking in your cluster monitoring?  Is it running hot?  What
>>>> do the hbase regionserver stats in UI say?  Anything interesting about
>>>> compaction queues or requests?
>>>>
>>> Hi, thanks for your answer, Stack. I will take the lead on this thread
>>> from now on.
>>>
>>> It does not happen on any particular region. Actually, things have
>>> gotten better now that compactions have been performed on all tables
>>> and have stopped.
>>>
>>> Nevertheless, we face a dramatic decrease in performance (especially
>>> on random gets) across the whole cluster:
>>>
>>> Despite the fact that we doubled our number of region servers (from 8
>>> to 16), and despite the fact that the CPU load on these region servers
>>> is just about 10% to 30%, performance is really bad: very often a
>>> slight increase in requests leads to clients blocked on requests and
>>> very long response times. It looks like a contention / deadlock
>>> somewhere in the HBase client and C code.
>>>
>>>
>>>
>>>> If you look at the thread dump all handlers are occupied serving
>>>> requests?  These timedout requests couldn't get into the server?
>>>>
>>> We will investigate that and report back to you.
>>>
>>>
>>>>> Before the timeouts, we observe an increasing CPU load on a single
>>>>> region server, and if we add region servers and wait for rebalancing,
>>>>> we always have the same region server causing problems like these:
>>>>>
>>>>> 2012-11-14 20:47:08,443 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@2c3da1aa), rpc version=1, client version=29, methodsFingerPrint=54742778 from <ip>:45334: output error
>>>>> 2012-11-14 20:47:08,443 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020 caught: java.nio.channels.ClosedChannelException
>>>>>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>>>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1653)
>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:924)
>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1003)
>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:409)
>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(
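
Stack's question about whether all handlers are occupied can be checked from a thread dump of the region server. The sketch below is a minimal, hypothetical helper (the class `HandlerDumpSummary` is not part of HBase) that counts the `java.lang.Thread.State` of threads named like the `IPC Server handler 3 on 60020` thread in the log above. It assumes the dump is in the usual `jstack` text layout, where each thread starts with a quoted name line followed by a `java.lang.Thread.State:` line.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: summarizes handler thread states from a jstack-style dump.
final class HandlerDumpSummary {
    // Thread name pattern taken from the log above: "IPC Server handler 3 on 60020".
    private static final Pattern HANDLER_HEADER =
            Pattern.compile("^\"IPC Server handler \\d+ on \\d+\"");
    // State line as printed by jstack, e.g. "   java.lang.Thread.State: WAITING (parking)".
    private static final Pattern STATE_LINE =
            Pattern.compile("^\\s*java\\.lang\\.Thread\\.State: (\\w+)");

    /** Returns a map from thread state (RUNNABLE, WAITING, ...) to handler thread count. */
    static Map<String, Integer> countHandlerStates(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        boolean inHandler = false;
        for (String line : dump.split("\\r?\\n")) {
            if (line.startsWith("\"")) {
                // A new thread section begins; remember whether it is a handler thread.
                inHandler = HANDLER_HEADER.matcher(line).find();
                continue;
            }
            if (inHandler) {
                Matcher m = STATE_LINE.matcher(line);
                if (m.find()) {
                    counts.merge(m.group(1), 1, Integer::sum);
                    inHandler = false; // one state line per thread
                }
            }
        }
        return counts;
    }
}
```

The state alone does not say what a handler is doing (a thread blocked in socket I/O is typically reported as RUNNABLE), so the stack frames under each handler still need to be read. The counts are only a quick first look at whether most handlers are busy or parked.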