HBase >> mail # user >> Improving HBase read performance (based on YCSB)


Re: Improving HBase read performance (based on YCSB)
Hi Bharath,

What does "iostat -dmx 5" say while you're running the benchmark? Let
it print out 10 or 15 lines and copy-paste them here.
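
A rough way to read that output, sketched in Python. This assumes the extended iostat layout with a header row naming columns such as rMB/s, wMB/s and %util; column names differ between sysstat versions, so the parser looks them up by name. The function names and both thresholds are illustrative choices, not anything defined by sysstat or HBase.

```python
def parse_iostat(text):
    """Parse `iostat -dmx` text into one {column: value} dict per device line."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            header = None  # a blank line ends the current report block
            continue
        if fields[0].rstrip(":") == "Device":
            # Header row: remember the column names for the lines that follow.
            header = [f.rstrip(":") for f in fields]
            continue
        if header and len(fields) == len(header):
            row = {"Device": fields[0]}
            try:
                for name, value in zip(header[1:], fields[1:]):
                    row[name] = float(value)
            except ValueError:
                continue  # not a numeric device line, skip it
            rows.append(row)
    return rows


def looks_seek_bound(row, util_threshold=80.0, mb_threshold=20.0):
    """Heuristic: the device is busy nearly all the time but moves little data."""
    mb_per_sec = row.get("rMB/s", 0.0) + row.get("wMB/s", 0.0)
    return row.get("%util", 0.0) >= util_threshold and mb_per_sec < mb_threshold
```

Feeding it the captured iostat text and checking looks_seek_bound() on each row flags devices where %util is near 100 while rMB/s stays small, which is the pattern to look for here.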

How do you know the disks have unused bandwidth? Sounds like they're
just bottlenecked on seeks: with random reads, a disk can be fully busy
seeking while its MB/s throughput still looks low.
Some upcoming work in 0.94 should give you a good boost here (Dhruba's
work to do checksumming at the HBase level).
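
For intuition on why disks can be seek-bound while their bandwidth looks idle, here is a back-of-envelope sketch in Python. Every input is an assumption rather than a measurement from this cluster: roughly 10 ms per random disk access, one data disk per regionserver, HBase's default 64 KB block size, and two disk accesses per uncached read (data file plus the separate HDFS checksum file) versus one once checksums are kept at the HBase level.

```python
def seek_bound_estimate(servers, disks_per_server, access_ms, accesses_per_read, block_kb):
    """Estimate random-read throughput when every read has to go to disk.

    Returns (cluster-wide reads per second, MB per second moved by each disk).
    """
    accesses_per_sec_per_disk = 1000.0 / access_ms
    reads_per_sec_per_disk = accesses_per_sec_per_disk / accesses_per_read
    cluster_reads_per_sec = reads_per_sec_per_disk * disks_per_server * servers
    mb_per_sec_per_disk = reads_per_sec_per_disk * block_kb / 1024.0
    return cluster_reads_per_sec, mb_per_sec_per_disk


# Assumed inputs: 5 servers, 1 disk each, 10 ms per access, 64 KB blocks.
for label, accesses in (("checksums in HDFS", 2), ("checksums in HBase", 1)):
    reads, mb = seek_bound_estimate(5, 1, 10.0, accesses, 64)
    print(f"{label}: ~{reads:.0f} reads/s cluster-wide, ~{mb:.1f} MB/s per disk")
```

Under these assumptions the whole cluster tops out at a few hundred reads per second while each disk moves only a few MB/s, so the disks can be saturated even though their bandwidth looks mostly unused.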

-Todd

On Mon, Feb 13, 2012 at 8:43 PM, Bharath Ravi <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I have a distributed HBase setup, on which I'm running the
> YCSB <https://github.com/brianfrankcooper/YCSB/wiki/running-a-workload> benchmark.
> There are 5 region servers, each a dual-core machine with around 4GB of memory,
> connected by a 1Gbps Ethernet switch.
>
> The number of "handlers" per regionserver is set to 500 (!) and HDFS's
> maximum receivers per datanode is 4096.
>
> The benchmark dataset is large enough not to fit in memory.
> Update/Insert/Write throughput goes up to 8000 ops/sec easily.
> However, I see read latencies on the order of seconds, and read throughput
> of only a few hundred ops per second.
>
> "Top" tells me that the CPUs on the regionservers spend 70-80% of their time
> waiting for IO, while disk and network have plenty of unused bandwidth.
> How could I diagnose where the read bottleneck is?
>
> Any help would be greatly appreciated :)
>
> Thanks in advance!
> --
> Bharath Ravi

--
Todd Lipcon
Software Engineer, Cloudera