HBase >> mail # dev >> HBase read performance and HBase client
Re: HBase read performance and HBase client
This thread dump was taken while the client was sending 60 requests in
parallel (at least in theory). There are 50 server handler threads.
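
To compare a dump like this against the 50 handler threads, it can help to tally how many threads sit in each state. The following is a minimal sketch, assuming the dump is plain text in one of the two usual jstack layouts (a `java.lang.Thread.State: ...` line per thread, or `(state = ...)` as printed by `jstack -F`). The function name and the sample text are hypothetical and are not taken from the pastebin linked in this thread.

```python
import re
from collections import Counter

# Two layouts are handled:
#   regular jstack:  "   java.lang.Thread.State: BLOCKED (on object monitor)"
#   jstack -F:       "Thread 4001: (state = IN_NATIVE)"
STATE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)|\(state = (\w+)\)")


def count_thread_states(dump_text):
    """Return a Counter mapping each thread state to the number of threads in it."""
    # findall yields one (group1, group2) tuple per match; exactly one is non-empty.
    return Counter(a or b for a, b in STATE.findall(dump_text))


# Hypothetical excerpt for illustration only.
SAMPLE_DUMP = """\
Thread 4001: (state = BLOCKED)
Thread 4002: (state = IN_NATIVE)
Thread 4003: (state = IN_NATIVE)
"""

if __name__ == "__main__":
    for state, n in count_thread_states(SAMPLE_DUMP).most_common():
        print(state, n)
```

Running it on several dumps taken a few seconds apart shows whether the same share of threads stays BLOCKED, which is more telling than a single snapshot.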
On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov
<[EMAIL PROTECTED]> wrote:

> Sure, here it is:
>
> http://pastebin.com/8TjyrKRT
>
> Isn't epoll used not only to read/write HDFS but also to connect/listen to
> clients?
>
>
> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
>> Can you show us what the thread dump looks like when the threads are
>> BLOCKED? There aren't that many locks on the read path when reading
>> out of the block cache, and epoll would only happen if you need to hit
>> HDFS, which you're saying is not happening.
>>
>> J-D
>>
>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
>> <[EMAIL PROTECTED]> wrote:
>> > I am hitting data in the block cache, of course. The data set is small
>> > enough to fit comfortably into the block cache, and all requests are
>> > directed to the same Region to guarantee single-RS testing.
>> >
>> > To Ted:
>> >
>> > Yes, it's CDH 4.3. What is the difference between 0.94.10 and 0.94.6
>> > with respect to read performance?
>> >
>> >
>> > On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>> >
>> >> That's a tough one.
>> >>
>> >> One thing that comes to mind is socket reuse. It used to come up more
>> >> often, but this is an issue that people hit when doing loads of
>> >> random reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
>> >> anything :)
>> >>
>> >> Also, if you _just_ want to saturate something, be it CPU or network,
>> >> wouldn't it be better to hit data only in the block cache? That way it
>> >> has the lowest overhead.
>> >>
>> >> Last thing I wanted to mention is that yes, the client doesn't scale
>> >> very well. I would suggest you give the asynchbase client a run.
>> >>
>> >> J-D
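
Before changing tcp_tw_recycle, it may be worth checking whether sockets are actually piling up in TIME_WAIT on the client host. The sketch below assumes a Linux host and reads the standard procfs files. The function names and the sample table are hypothetical, and the tcp_tw_recycle file is absent on kernels where the option no longer exists, in which case the script reports it as not available.

```python
from pathlib import Path

TCP_TABLE = Path("/proc/net/tcp")  # IPv4 sockets; /proc/net/tcp6 has the same layout
TW_RECYCLE = Path("/proc/sys/net/ipv4/tcp_tw_recycle")
TIME_WAIT = "06"  # hex code for TIME_WAIT in the "st" column of /proc/net/tcp


def read_or_none(path):
    """Return the file's text, or None if it cannot be read on this system."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def count_time_wait(table_text):
    """Count rows of a /proc/net/tcp style table whose state is TIME_WAIT."""
    count = 0
    for line in table_text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


# Hypothetical table excerpt for illustration, trimmed to the first columns.
SAMPLE_TABLE = """\
  sl  local_address rem_address   st
   0: 0100007F:EA74 0100007F:EA60 06
   1: 0100007F:EA60 00000000:0000 0A
   2: 0100007F:EA76 0100007F:EA60 06
"""

if __name__ == "__main__":
    table = read_or_none(TCP_TABLE)
    source = table if table is not None else SAMPLE_TABLE
    print("TIME_WAIT sockets:", count_time_wait(source))
    setting = read_or_none(TW_RECYCLE)
    print("tcp_tw_recycle:", setting.strip() if setting is not None else "not available")
```

If the TIME_WAIT count stays low during the test, socket reuse is probably not the bottleneck and the setting is unlikely to help.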
>> >>
>> >> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > I have been doing quite extensive testing of different read scenarios:
>> >> >
>> >> > 1. blockcache disabled/enabled
>> >> > 2. data is local/remote (no good hdfs locality)
>> >> >
>> >> > and it turned out that I cannot saturate 1 RS using one client host
>> >> > (comparable in CPU power and RAM):
>> >> >
>> >> > I am running a client app with 60 active read threads (with multi-get)
>> >> > that all go to one particular RS, and this RS's load is 100-150% (out of
>> >> > 3200% available), which means the load is ~5%.
>> >> >
>> >> > All threads in the RS are either in the BLOCKED (wait) or IN_NATIVE
>> >> > (epoll) state.
>> >> >
>> >> > I attribute this to the HBase client implementation, which seems not to
>> >> > be scalable (I am going to dig into the client later today).
>> >> >
>> >> > Some numbers: the maximum I could get from single gets (60 threads) is
>> >> > 30K per sec. Multi-get gives ~75K (60 threads).
>> >> >
>> >> > What are my options? I want to measure the limits, and I do not want to
>> >> > run a cluster of clients against just ONE Region Server.
>> >> >
>> >> > RS config: 96GB RAM, 16(32) CPU
>> >> > Client     : 48GB RAM   8 (16) CPU
>> >> >
>> >> > Best regards,
>> >> > Vladimir Rodionov
>> >> > Principal Platform Engineer
>> >> > Carrier IQ, www.carrieriq.com
>> >> > e-mail: [EMAIL PROTECTED]
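
The figures in the original message can be checked with a little arithmetic. This is a rough sketch under two assumptions that are not stated in the thread: that 100% of load means one fully busy logical CPU (so 32 CPUs give 3200%), and that each of the 60 client threads keeps exactly one request in flight. Under those assumptions the RS is about 3.1% to 4.7% busy, and 30K single gets per second works out to about 2 ms per get. The function names are hypothetical.

```python
def load_fraction(load_percent, cpus):
    """Fraction of total CPU in use when 100% means one fully busy CPU."""
    return load_percent / (cpus * 100.0)


def mean_latency_ms(in_flight, ops_per_sec):
    """Little's law: mean time per operation = requests in flight / throughput."""
    return in_flight / ops_per_sec * 1000.0


if __name__ == "__main__":
    # Region Server: 100-150% load out of 3200% available (32 logical CPUs).
    print("RS load:", load_fraction(100, 32), "to", load_fraction(150, 32))
    # Single gets: 60 threads, 30K gets per second in total.
    print("ms per single get:", mean_latency_ms(60, 30_000))
```

The multi-get figure is left out on purpose: the batch size per call is not given, so 75K per second cannot be turned into a per-call latency.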