Re: HBase - Performance issue
Hi,
           How many request handlers are configured on your RegionServers?
Can you increase that number and see whether it helps?

-Anoop-
On Wed, Apr 24, 2013 at 3:42 PM, kzurek <[EMAIL PROTECTED]> wrote:

> The problem is this: when I put my data into the cluster (multithreaded
> client, ~30 MB/s outgoing traffic), the load is spread evenly over all
> RegionServers, with 3.5% average CPU wait time (average CPU user: 51%).
> When I add a similar multithreaded client that Scans for, let's say, the
> last 100 samples of a randomly generated key from a chosen time range, I
> get high CPU wait time (20% and up) on two random RegionServers (or more
> if the number of threads is higher; the default is 10). As a result, the
> machines hosting those RSs get very hot. One consequence is that the
> number of store files keeps increasing, up to the maximum limit. The rest
> of the RSs have 10-12% CPU wait time and everything seems to be OK (the
> number of store files varies, so they are being compacted and not
> increasing over time). Any ideas? Could I prioritize writes over reads
> somehow? Is that possible? If so, what would be the best way to do that,
> and where should it be done: on the client or on the cluster side?
>
> Cluster specification:
> HBase Version   0.94.2-cdh4.2.0
> Hadoop Version  2.0.0-cdh4.2.0
> There are 6 x DataNode (5 x HDD for storing data), 1 x MasterNode
> Other settings:
>  - Bloom filters (ROWCOL) set
>  - Short-circuit reads turned on
>  - HDFS Block Size: 128MB
>  - Java Heap Size of Namenode/Secondary Namenode in Bytes: 8 GiB
>  - Java Heap Size of HBase RegionServer in Bytes: 12 GiB
>  - Java Heap Size of HBase Master in Bytes: 4 GiB
>  - Java Heap Size of DataNode in Bytes: 1 GiB (default)
> Number of regions per RegionServer: 19 (total 114 regions on 6 RS)
> Key design: <UUID><TIMESTAMP> -> UUID: 1-10M, TIMESTAMP: 1-N
> Table design: 1 column family with 20 columns of 8 bytes
>
> Get client:
> Multiple threads.
> Each thread has its own table instance with its own Scanner.
> Each thread has its own range of UUIDs and randomly draws the beginning
> of the time range to build the rowkey properly (see above).
> Each Scan requests the same number of rows, but with a random rowkey.
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-Performance-issue-tp4042836.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
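
The key design in the quoted message (<UUID><TIMESTAMP>) and the read pattern (a fixed number of rows for one UUID, starting at a random point in the time range) can be sketched as below. This is only an illustration under assumptions the original post does not state: both parts are taken to be non-negative numbers stored as 8-byte big-endian longs, and the class and method names (RowKeySketch, rowKey, scanRange, compareRows) are hypothetical. It uses only the Java standard library, so it shows how the start and stop rows would be derived rather than how the poster's client actually builds them.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: not taken from the original post.
final class RowKeySketch {

    // Assumption: UUID and TIMESTAMP are non-negative and encoded as
    // 8-byte big-endian longs, giving a fixed 16-byte row key.
    static byte[] rowKey(long uuid, long timestamp) {
        return ByteBuffer.allocate(16).putLong(uuid).putLong(timestamp).array();
    }

    // Unsigned lexicographic byte comparison, which is how HBase orders rows.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Start row (inclusive) and stop row (exclusive) covering `count`
    // consecutive timestamps of a single UUID.
    static byte[][] scanRange(long uuid, long firstTimestamp, int count) {
        return new byte[][] {
            rowKey(uuid, firstTimestamp),
            rowKey(uuid, firstTimestamp + count)
        };
    }

    public static void main(String[] args) {
        // Example values only: UUID 42, 100 samples starting at timestamp 1000.
        byte[][] range = scanRange(42L, 1000L, 100);
        System.out.println("start row length: " + range[0].length);
        System.out.println("start sorts before stop: "
                + (compareRows(range[0], range[1]) < 0));
    }
}
```

With the HBase client, the two arrays would be passed as the start and stop rows of a Scan, for example via the Scan(byte[] startRow, byte[] stopRow) constructor. Under this encoding all rows of one UUID are contiguous, so each such Scan reads from a single region (or two, if the range crosses a region boundary).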