HBase >> mail # dev >> HBase read performance and HBase client


+ Vladimir Rodionov 2013-07-30, 18:23
+ Ted Yu 2013-07-30, 18:25
+ Stack 2013-07-30, 19:32
+ lars hofhansl 2013-07-30, 20:14
+ Jean-Daniel Cryans 2013-07-30, 19:06
+ Vladimir Rodionov 2013-07-30, 19:16
+ Jean-Daniel Cryans 2013-07-30, 19:31
+ Vladimir Rodionov 2013-07-30, 20:15
+ Jean-Daniel Cryans 2013-07-30, 20:35
+ Vladimir Rodionov 2013-07-30, 20:52
+ Vladimir Rodionov 2013-07-30, 20:58
+ Ted Yu 2013-07-30, 21:01
+ Vladimir Rodionov 2013-07-30, 20:17
+ Vladimir Rodionov 2013-07-30, 20:22
+ Vladimir Rodionov 2013-07-30, 20:30
+ lars hofhansl 2013-07-30, 20:50
+ Vladimir Rodionov 2013-07-30, 21:01
+ Vladimir Rodionov 2013-08-01, 02:27
+ lars hofhansl 2013-08-01, 04:33
+ Vladimir Rodionov 2013-08-01, 04:57
+ lars hofhansl 2013-08-01, 06:15
+ Varun Sharma 2013-08-01, 06:37
+ Vladimir Rodionov 2013-08-01, 16:24
+ Ted Yu 2013-08-01, 16:27
+ Vladimir Rodionov 2013-08-01, 17:11
Re: HBase read performance and HBase client
Network? 1GbE or 10GbE?

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jul 31, 2013, at 9:27 PM, "Vladimir Rodionov" <[EMAIL PROTECTED]> wrote:

> Some final numbers:
>
> Test config:
>
> HBase 0.94.6
> blockcache=true, block size = 64K, KV size = 62 bytes (raw).
>
> 5 clients: 96GB, 16(32) CPUs (2.2GHz), CentOS 5.7
> 1 region server: the same config.
>
> Local network with ping between hosts: 0.1 ms
>
>
> 1. The HBase client hits the wall at ~50K ops per sec regardless of # of CPUs,
> threads, IO pool size, and other settings.
> 2. The HBase server was able to sustain 170K ops per sec (with 64K block size),
> all from block cache. KV size = 62 bytes (very small). This is for single Get
> ops, 60 threads per client, 5 clients (on different hosts).
> 3. Multi-get hits the wall at the same 170K-200K ops per sec. Batch sizes
> tested: 30 and 100. Absolutely the same performance as with batch size = 1.
> Multi-get has some internal issue on the RegionServer side, maybe excessive
> locking or something else.
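
A minimal sketch of the multi-get pattern measured above, against the 0.94-era client API; the table name and row keys are hypothetical placeholders:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "test_table"); // hypothetical table
            try {
                // Batch size 30, one of the sizes tested above.
                List<Get> batch = new ArrayList<Get>();
                for (int i = 0; i < 30; i++) {
                    batch.add(new Get(Bytes.toBytes("row-" + i)));
                }
                // One client call groups the Gets per region server
                // instead of issuing one RPC per row.
                Result[] results = table.get(batch);
                System.out.println("fetched " + results.length + " rows");
            } finally {
                table.close();
            }
        }
    }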
>
>
>
>
>
> On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>
>> 1. SCR (short-circuit reads) are enabled.
>> 2. A single Configuration for all tables did not work well, but I will try
>> it again.
>> 3. With Nagle's I had 0.8ms avg; without, 0.4ms - I see the difference.
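
A rough single-threaded sketch of the latency comparison above: flip the client-side tcpnodelay setting and measure the average Get time (the table and row key are hypothetical; exact numbers will vary):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GetLatencySketch {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            // Toggle to compare Nagle's on vs. off on the client side.
            conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
            HTable table = new HTable(conf, "test_table"); // hypothetical table
            try {
                byte[] row = Bytes.toBytes("row-0");       // hypothetical row key
                int n = 10000;
                long start = System.nanoTime();
                for (int i = 0; i < n; i++) {
                    table.get(new Get(row));               // one RPC per Get
                }
                double avgMs = (System.nanoTime() - start) / 1e6 / n;
                System.out.println("avg Get latency: " + avgMs + " ms");
            } finally {
                table.close();
            }
        }
    }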
>>
>>
>> On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>>
>>> With Nagle's you'd see something around 40ms (Nagle's interacting with
>>> delayed ACKs stalls senders for the ~40ms delayed-ACK timer). You are not
>>> saying 0.8ms RTT is bad, right? Are you seeing ~40ms latencies?
>>>
>>> This thread has gotten confusing.
>>>
>>> I would try these:
>>> * one Configuration for all tables. Or even use a single
>>> HConnection/Threadpool and use the HTable(byte[], HConnection,
>>> ExecutorService) constructor
>>> * disable Nagle's: set both ipc.server.tcpnodelay and
>>> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
>>> server)
>>> * increase hbase.client.ipc.pool.size in client's hbase-site.xml
>>> * enable short circuit reads (details depend on exact version of Hadoop).
>>> Google will help :)
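
A minimal sketch of the first three suggestions combined, assuming the 0.94-era client API (HConnectionManager.createConnection and the HTable(byte[], HConnection, ExecutorService) constructor); the pool sizes, table name, and tuning values are illustrative guesses, not recommendations:

    import java.io.IOException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionSketch {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            // Disable Nagle's on the client; ipc.server.tcpnodelay has to be
            // set in the region server's hbase-site.xml, not here.
            conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
            // More sockets per region server (the value is a guess to tune).
            conf.setInt("hbase.client.ipc.pool.size", 10);

            // One connection and one thread pool shared by every HTable.
            HConnection connection = HConnectionManager.createConnection(conf);
            ExecutorService pool = Executors.newFixedThreadPool(60);
            try {
                HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
                // ... worker threads issue Gets through tables built this way ...
                table.close();
            } finally {
                pool.shutdown();
                connection.close();
            }
        }
    }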
>>>
>>> -- Lars
>>>
>>>
>>> ----- Original Message -----
>>> From: Vladimir Rodionov <[EMAIL PROTECTED]>
>>> To: [EMAIL PROTECTED]
>>> Cc:
>>> Sent: Tuesday, July 30, 2013 1:30 PM
>>> Subject: Re: HBase read perfomnance and HBase client
>>>
>>> Does this hbase.ipc.client.tcpnodelay (default: false) explain the poor
>>> single-thread performance and high latency (0.8ms on a local network)?
>>>
>>>
>>> On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>>>
>>>> One more observation: one Configuration instance per HTable gives a 50%
>>>> boost compared to a single Configuration object for all HTables - from
>>>> 20K to 30K ops per sec.
>>>>
>>>>
>>>> On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> This thread dump was taken while the client was sending 60 requests in
>>>>> parallel (at least in theory). There are 50 server handler threads.
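
If 60 concurrent requests are queuing behind 50 handlers, the usual knob is the region server's RPC handler count; a minimal sketch for the server-side hbase-site.xml (the value here is illustrative, not a recommendation):

    <!-- Raise the RPC handler count above the peak number of
         concurrent client requests. -->
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>100</value>
    </property>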
>>>>>
>>>>>
>>>>> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Sure, here it is:
>>>>>>
>>>>>> http://pastebin.com/8TjyrKRT
>>>>>>
>>>>>> epoll is used not only to read/write HDFS but to connect/listen to
>>>>>> clients as well, right?
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Can you show us what the thread dump looks like when the threads are
>>>>>>> BLOCKED? There aren't that many locks on the read path when reading
>>>>>>> out of the block cache, and epoll would only happen if you need to
>>>>>>> hit HDFS, which you're saying is not happening.
>>>>>>>
>>>>>>> J-D
>>>>>>>
>>>>>>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
>>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>>> I am hitting data in the block cache, of course. The data set is
>>>>>>>> small enough to fit comfortably into the block cache, and all requests
>>>>>>>> are directed to
+ Vladimir Rodionov 2013-08-01, 18:10
+ Michael Segel 2013-08-01, 19:10
+ Vladimir Rodionov 2013-08-01, 20:25