HBase >> mail # dev >> Reads are very slow

Re: Reads are very slow

Hi there,

One thing to point out is that GZ is the tightest but slowest compression algorithm. If speed is important, you probably want to look at Snappy as the compression algorithm for your CF.
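
As a rough sketch of what switching the column family to Snappy could look like, the commands below use the table name 'cookies' and family 'cf' from the description quoted further down. The script file name is made up for illustration, and the sketch assumes the Snappy native libraries are already installed on every region server. Older releases need the table disabled before an alter, and existing store files stay GZ-compressed until a compaction rewrites them, hence the major_compact at the end.

```shell
# Write the HBase shell commands to a script file (the file name is hypothetical).
cat > switch_to_snappy.txt <<'EOF'
disable 'cookies'
alter 'cookies', {NAME => 'cf', COMPRESSION => 'SNAPPY'}
enable 'cookies'
major_compact 'cookies'
EOF

# To apply the change, pass the script to the HBase shell on a cluster node:
#   hbase shell switch_to_snappy.txt
```

It is worth trying this on a test table first, since the table is unavailable while it is disabled.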

From: Vibhav Mundra <[EMAIL PROTECTED]>
Date: Tuesday, May 21, 2013 6:21 AM
Subject: Reads are very slow

Hi All,

I am trying to use HBase as a key-value store, where the key is stored as the row key and the value in a column family.

The description of the table is as follows:
 {NAME => 'cookies', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE',
  BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS => '1',
  COMPRESSION => 'GZ', MIN_VERSIONS => '0', TTL => '2147483647',
  KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '8192', IN_MEMORY => 'false',
  ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}

I am accessing the values using the REST API.

curl -H "Accept:application/octet-stream"

This is bound to return a single result, since the table is used as a key-value store.

Now the strange part is that sometimes a request takes 40 milliseconds to answer.
I am firing some 50000 requests per second. Is there something I can do to optimize this? Basically, I am looking at returning the results in 5 milliseconds.

The HBase setup is as follows:
1 node --> running the HBase master, the Hadoop namenode/secondary node and the HBase REST client.
2 nodes --> running the datanodes and regionservers.

The 2 nodes that are running regionservers/datanodes have 4 SSDs each, and the data is spread across all four SSDs.
Each node has 12 GB of RAM.

I also find it very strange that there is negligible disk I/O and CPU usage on the datanodes.

I have tried various combinations, but none of them has made any difference. I am attaching the hbase-site.xml. If required, I will also attach the Hadoop configuration files.
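
To put numbers on "sometimes 40 milliseconds", one option is to record per-request latencies and look at percentiles rather than single samples. The sketch below is only an illustration: the URL (host, port, row key and column) is a placeholder, since the real one is not shown in the message, and the function names are made up. It relies on curl's --write-out variable time_total and computes percentiles with the nearest-rank method.

```shell
# Hypothetical endpoint: substitute the real REST host, table, row key and column.
URL="${URL:-http://localhost:8080/cookies/some-row-key/cf:value}"

# Print the total time of one request in milliseconds (curl reports seconds).
time_request_ms() {
  curl -s -o /dev/null -H "Accept: application/octet-stream" \
       -w '%{time_total}\n' "$1" | awk '{ printf "%.1f\n", $1 * 1000 }'
}

# Read one latency per line on stdin and print the requested percentile
# (nearest-rank method: rank = ceil(p * N / 100)).
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      r = int((p * NR + 99) / 100)
      if (r < 1) r = 1
      print v[r]
    }'
}

# Example: sample 1000 sequential requests, then report the median and p99.
#   for i in $(seq 1000); do time_request_ms "$URL"; done > latencies.txt
#   percentile 50 < latencies.txt
#   percentile 99 < latencies.txt
```

A median near the 5 millisecond target with a much larger p99 would point at occasional stalls rather than uniformly slow reads. Sequential requests from a single client will not reproduce the behaviour at 50000 requests per second, so this only gives a baseline.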