HBase >> mail # user >> Poor HBase map-reduce scan performance

Re: Poor HBase map-reduce scan performance
Sorry. Haven't gotten to this, yet.

Scanning in HBase being about 3x slower than straight HDFS is in the right ballpark, though. It has to do a bit more work.

Generally, HBase is great at homing in on a subset (some 10-100m rows) of the data. Raw scan performance is not (yet) a strength of HBase.

So with HDFS you get to 75% of the theoretical maximum read throughput; hence with HBase you get to 25% of the theoretical cluster-wide maximum disk throughput?
-- Lars

----- Original Message -----
From: Bryan Keller <[EMAIL PROTECTED]>
Sent: Friday, May 10, 2013 8:46 AM
Subject: Re: Poor HBase map-reduce scan performance

FYI, I ran tests with compression on and off.

With a plain HDFS sequence file and compression off, I am getting very good I/O numbers, roughly 75% of theoretical max for reads. With snappy compression on with a sequence file, I/O speed is about 3x slower. However the file size is 3x smaller so it takes about the same time to scan.

With HBase, the results are equivalent (just much slower than a sequence file). Scanning a compressed table yields about 3x lower I/O throughput than an uncompressed table, but the table is 3x smaller, so the time to scan is about the same. Scanning an HBase table takes about 3x as long as scanning the sequence file export of the table, either compressed or uncompressed. The sequence file export ends up being just barely larger than the table, either compressed or uncompressed.

So in sum, compression slows down I/O 3x, but the file is 3x smaller so the time to scan is about the same. Adding in HBase slows things down another 3x. So I'm seeing 9x faster I/O scanning an uncompressed sequence file vs scanning a compressed table.
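The 9x figure follows directly from multiplying the two 3x factors. A quick sanity check of that arithmetic (the throughput and size numbers below are arbitrary placeholders, chosen only to illustrate the ratios):

```java
public class ScanMath {
    public static void main(String[] args) {
        // Hypothetical baseline: I/O throughput scanning an uncompressed
        // sequence file (arbitrary units).
        double seqUncompressedIo = 9.0;

        // Snappy: ~3x slower I/O, but the file is ~3x smaller, so the
        // wall-clock scan time comes out roughly the same.
        double seqCompressedIo = seqUncompressedIo / 3.0;
        double uncompressedSize = 3.0;
        double compressedSize = uncompressedSize / 3.0;
        double timeUncompressed = uncompressedSize / seqUncompressedIo;
        double timeCompressed = compressedSize / seqCompressedIo;

        // HBase adds roughly another 3x slowdown on top.
        double hbaseCompressedIo = seqCompressedIo / 3.0;

        System.out.println("scan time, compressed vs uncompressed seqfile: "
                + (timeCompressed / timeUncompressed));     // 1.0 (a wash)
        System.out.println("I/O, uncompressed seqfile vs compressed HBase: "
                + (seqUncompressedIo / hbaseCompressedIo)); // 9.0
    }
}
```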
On May 8, 2013, at 10:15 AM, Bryan Keller <[EMAIL PROTECTED]> wrote:

> Thanks for the offer Lars! I haven't made much progress speeding things up.
> I finally put together a test program that populates a table that is similar to my production dataset. I have a readme that should describe things, hopefully enough to make it usable. There is code to populate a test table, code to scan the table, and code to scan sequence files from an export (to compare HBase w/ raw HDFS). I use a gradle build script.
> You can find the code here:
> https://dl.dropboxusercontent.com/u/6880177/hbasetest.zip
> On May 4, 2013, at 6:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> The blockbuffers are not reused, but that by itself should not be a problem as they are all the same size (at least I have never identified that as one in my profiling sessions).
>> My offer still stands to do some profiling myself if there is an easy way to generate data of similar shape.
>> -- Lars
>> ________________________________
>> From: Bryan Keller <[EMAIL PROTECTED]>
>> Sent: Friday, May 3, 2013 3:44 AM
>> Subject: Re: Poor HBase map-reduce scan performance
>> Actually I'm not too confident in my results re block size, they may have been related to major compaction. I'm going to rerun before drawing any conclusions.
>> On May 3, 2013, at 12:17 AM, Bryan Keller <[EMAIL PROTECTED]> wrote:
>>> I finally made some progress. I tried a very large HBase block size (16mb), and it significantly improved scan performance. I went from 45-50 min to 24 min. Not great, but much better. Before, I had it set to 128k. Scanning an equivalent sequence file takes 10 min. My random read performance will probably suffer with such a large block size (theoretically), so I probably can't keep it this big. I care about random read performance too. I've read that having a block size this big is not recommended; is that correct?
>>> I haven't dug too deeply into the code, are the block buffers reused or is each new block read a new allocation? Perhaps a buffer pool could help here if there isn't one already. When doing a scan, HBase could reuse previously allocated block buffers instead of allocating a new one for each block. Then block size shouldn't affect scan performance much.
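For what it's worth, the buffer-pool idea proposed above could look something like the sketch below. This is purely illustrative and not HBase's actual code; the class and method names are made up. The point is just that a scanner could recycle fixed-size block buffers instead of allocating a fresh byte[] for every block read:

```java
import java.util.ArrayDeque;

// Illustrative sketch only: a trivial pool of fixed-size block buffers.
public class BlockBufferPool {
    private final int blockSize;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    public BlockBufferPool(int blockSize) {
        this.blockSize = blockSize;
    }

    // Hand out a recycled buffer if one is available, else allocate a new one.
    public synchronized byte[] acquire() {
        byte[] buf = free.poll();
        return (buf != null) ? buf : new byte[blockSize];
    }

    // Return a buffer to the pool once the block has been consumed.
    public synchronized void release(byte[] buf) {
        if (buf.length == blockSize) {
            free.push(buf);
        }
    }
}
```

With something along these lines, a scan over N blocks would only ever allocate as many buffers as are in flight at once, so block size would matter much less for allocation pressure during scans.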