HBase >> mail # dev >> HBase read performance and HBase client

HBase read performance and HBase client
I have been doing quite extensive testing of different read scenarios:

1. block cache disabled/enabled
2. data is local/remote (remote meaning poor HDFS locality)

and it turned out that I cannot saturate one RS from a single client host (comparable in CPU power and RAM):

I am running a client app with 60 active read threads (using multi-get), all going to one particular RS, and
this RS's load is 100-150% (out of 3200% available), which means the load is only about 3-5%.
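
The utilization figure above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration, and it assumes the load is reported top-style, where 100% means one fully busy logical CPU and 3200% therefore corresponds to 32 logical CPUs.

```java
public class RegionServerLoad {
    // Converts a top-style CPU percentage (100% = one fully busy logical CPU)
    // into a fraction of the whole host's capacity.
    // Assumption: every logical CPU contributes 100% to the available total.
    static double fractionOfHost(double topPercent, int logicalCpus) {
        return topPercent / (100.0 * logicalCpus);
    }

    public static void main(String[] args) {
        int logicalCpus = 32; // 3200% available, as stated in the message
        System.out.printf("low:  %.1f%% of host%n", 100 * fractionOfHost(100, logicalCpus));
        System.out.printf("high: %.1f%% of host%n", 100 * fractionOfHost(150, logicalCpus));
    }
}
```

With the reported 100-150%, this works out to between 100/3200 and 150/3200 of the host, i.e. a little under 5% at the high end.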

All threads in the RS are either in the BLOCKED (wait) or the IN_NATIVE (epoll) state.

I attribute this to the HBase client implementation, which does not seem to scale (I am going to dig into the client later today).

Some numbers: the maximum I could get with single gets (60 threads) is 30K per sec. Multi-get gives ~75K per sec (60 threads).
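
To put these numbers in per-thread terms, here is a rough sketch. It assumes each of the 60 threads issues one blocking call at a time and that the figures are total operations per second; under that assumption, throughput is simply threads divided by average call latency. The class and method names are hypothetical.

```java
public class ReadThroughput {
    // Average rate achieved by each thread.
    static double perThreadRate(double totalPerSec, int threads) {
        return totalPerSec / threads;
    }

    // Average time per operation as seen by one thread, in milliseconds.
    // Assumption: each thread is synchronous (one outstanding call at a time).
    static double impliedLatencyMillis(double totalPerSec, int threads) {
        return 1000.0 * threads / totalPerSec;
    }

    public static void main(String[] args) {
        int threads = 60;
        System.out.println("single get, per thread/sec: " + perThreadRate(30_000, threads));
        System.out.println("single get, ms per get:     " + impliedLatencyMillis(30_000, threads));
        System.out.println("multi-get, per thread/sec:  " + perThreadRate(75_000, threads));
    }
}
```

For single gets this gives 500 gets per second per thread, or about 2 ms per round trip, which suggests the threads spend most of their time waiting rather than keeping the RS busy. The multi-get batch size is not stated, so no per-call latency is derived for it.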

What are my options? I want to measure the limits, and I do not want to run a cluster of clients against just ONE Region Server.
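
One way to see where a single client host stops scaling is to sweep the thread count and watch whether throughput flattens. The following is a minimal harness sketch using only the JDK; it is not the test app described above. `ReadLoadHarness`, `run`, and `placeholderRead` are hypothetical names, and the placeholder does nothing, so a real get or multi-get call would have to be plugged in.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class ReadLoadHarness {
    // Runs readOp in a loop on `threads` threads for roughly durationMillis
    // and returns the total number of completed calls.
    static long run(int threads, long durationMillis, Runnable readOp) {
        LongAdder completed = new LongAdder();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Overflow-safe comparison against the nanoTime deadline.
                while (deadline - System.nanoTime() > 0) {
                    readOp.run();
                    completed.increment();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(durationMillis + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return completed.sum();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real read call; replace with actual client code.
        Runnable placeholderRead = () -> { };
        long durationMillis = 1_000;
        for (int threads : new int[] {1, 15, 30, 60}) {
            long ops = run(threads, durationMillis, placeholderRead);
            System.out.printf("%d threads: %d ops/sec%n", threads, ops * 1000 / durationMillis);
        }
    }
}
```

If ops/sec stops growing well before 60 threads while the RS stays mostly idle, that points at the client side (or the network) rather than at the server.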

RS config: 96GB RAM, 16 (32) CPU
Client:    48GB RAM, 8 (16) CPU

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com