HBase, mail # dev - HBase read performance and HBase client


Re: HBase read performance and HBase client
Vladimir Rodionov 2013-07-30, 20:17
This thread dump was taken while the client was sending 60 requests in
parallel (at least in theory). There are 50 server handler threads.
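
For context, a minimal sketch of the kind of client-side load described above: 60 threads, each with its own HTable (HTable is not thread-safe), issuing batched multi-gets against a single table via the 0.94-era client API. The table name, key layout, and batch size are illustrative placeholders. Note that all HTable instances created from the same Configuration share one underlying HConnection to the RegionServer, which is the part of the client whose scalability is in question here.

// Sketch of the described load test: 60 client threads, each issuing
// multi-gets (batched Gets) against one table with the HBase 0.94 client.
// Table name, column family-free keys, and batch size are placeholders.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class MultiGetLoad {
  private static final int THREADS = 60;   // client read threads
  private static final int BATCH = 100;    // Gets per multi-get call

  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    ExecutorService pool = Executors.newFixedThreadPool(THREADS);
    for (int t = 0; t < THREADS; t++) {
      pool.submit(new Runnable() {
        public void run() {
          try {
            // HTable is not thread-safe: one instance per thread.
            // All instances built from the same conf share one HConnection.
            HTable table = new HTable(conf, "test_table");
            while (!Thread.currentThread().isInterrupted()) {
              List<Get> batch = new ArrayList<Get>(BATCH);
              for (int i = 0; i < BATCH; i++) {
                // keys chosen so every request lands in the same Region
                batch.add(new Get(Bytes.toBytes("row-" + i)));
              }
              Result[] results = table.get(batch); // one multi-get round trip
            }
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      });
    }
    // Runs until the process is killed; intended purely as a load generator.
  }
}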
On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov
<[EMAIL PROTECTED]> wrote:

> Sure, here it is:
>
> http://pastebin.com/8TjyrKRT
>
> Isn't epoll used not only to read/write HDFS but also to connect/listen to
> clients as well?
>
>
> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
>> Can you show us what the thread dump looks like when the threads are
>> BLOCKED? There aren't that many locks on the read path when reading
>> out of the block cache, and epoll would only happen if you need to hit
>> HDFS, which you're saying is not happening.
>>
>> J-D
>>
>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
>> <[EMAIL PROTECTED]> wrote:
>> > I am hitting data in the block cache, of course. The data set is small
>> > enough to fit comfortably into the block cache, and all requests are
>> > directed to the same Region to guarantee single-RS testing.
>> >
>> > To Ted:
>> >
>> > Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6 with
>> > respect to read performance?
>> >
>> >
>> > On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>> >
>> >> That's a tough one.
>> >>
>> >> One thing that comes to mind is socket reuse. It used to come up more
>> >> often, but it is an issue that people hit when doing loads of random
>> >> reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
>> >> anything :)
>> >>
>> >> Also if you _just_ want to saturate something, be it CPU or network,
>> >> wouldn't it be better to hit data only in the block cache? This way it
>> >> has the lowest overhead?
>> >>
>> >> Last thing I wanted to mention is that yes, the client doesn't scale
>> >> very well. I would suggest you give the asynchbase client a run.
>> >>
>> >> J-D
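
For reference, a minimal sketch of a read issued through the asynchbase client suggested above, assuming the asynchbase 1.x API (HBaseClient, GetRequest, Deferred); the ZooKeeper quorum, table, and row key are placeholders. asynchbase keeps a single non-blocking connection per RegionServer and returns Deferreds instead of blocking per call, which is why it tends to hold up better under heavy random-read load.

// Minimal sketch, assuming the asynchbase 1.x API. Quorum spec, table name
// and row key are placeholders, not values taken from this thread.
import java.util.ArrayList;

import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;

import com.stumbleupon.async.Callback;
import com.stumbleupon.async.Deferred;

public class AsyncGetExample {
  public static void main(String[] args) throws Exception {
    final HBaseClient client = new HBaseClient("zkhost:2181");
    GetRequest get = new GetRequest("test_table", "row-1");
    // The get is issued asynchronously; the callback fires when the row
    // arrives, so many requests can be in flight from a single thread.
    Deferred<Object> done = client.get(get).addCallback(
        new Callback<Object, ArrayList<KeyValue>>() {
          public Object call(ArrayList<KeyValue> row) {
            System.out.println("got " + row.size() + " cells");
            return null;
          }
        });
    done.join();                 // block only for this demo
    client.shutdown().join();    // flush pending RPCs and release resources
  }
}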
>> >>
>> >> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > I have been doing quite extensive testing of different read scenarios:
>> >> >
>> >> > 1. blockcache disabled/enabled
>> >> > 2. data is local/remote (no good hdfs locality)
>> >> >
>> >> > and it turned out that I cannot saturate 1 RS using one client host
>> >> > (comparable in CPU power and RAM):
>> >> >
>> >> > I am running a client app with 60 read threads active (with multi-get)
>> >> > going to one particular RS, and this RS's load is 100-150% (out of
>> >> > 3200% available) - that means the load is ~5%.
>> >> >
>> >> > All threads in the RS are either in BLOCKED (wait) or IN_NATIVE (epoll)
>> >> > states.
>> >> >
>> >> > I attribute this to the HBase client implementation, which does not
>> >> > seem to be scalable (I am going to dig into the client later today).
>> >> >
>> >> > Some numbers: the maximum I could get from single gets (60 threads) is
>> >> > 30K per sec. Multi-get gives ~75K (60 threads).
>> >> >
>> >> > What are my options? I want to measure the limits, and I do not want
>> >> > to run a cluster of clients against just one Region Server.
>> >> >
>> >> > RS config: 96GB RAM, 16 (32) CPU
>> >> > Client:    48GB RAM,  8 (16) CPU
>> >> >
>> >> > Best regards,
>> >> > Vladimir Rodionov
>> >> > Principal Platform Engineer
>> >> > Carrier IQ, www.carrieriq.com
>> >> > e-mail: [EMAIL PROTECTED]
>> >> >
>> >> >