HBase >> mail # dev >> Early comparisons between 0.90 and 0.92

Re: Early comparisons between 0.90 and 0.92
Thanks for the info, J-D.

I guess the 1.1 below is in millions.

Can you tell us more about your tables (bloom filters, etc.)?

On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Hey guys,
> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
> regarding reads. The numbers are kinda irrelevant but the differences
> are. BTW this is on CDH3u3 with random reads.
> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
> 1.7M rows scanned per second.
> In 0.92.0, scanning those same rows (meaning that I didn't run
> compactions after migrating so it's picking the same data from the OS
> cache), I scan about 1.1 rows per second.
> 0.92 is 50% slower when scanning.
> In 0.90.0 random reading 50M rows that are OS cached I can do about
> 200k reads per second.
> In 0.92.0, again with those same rows, I can go up to 260k per second.
> 0.92 is 30% faster when random reading.
> I've been playing with that data set for a while and the numbers in
> 0.92.0 when using HFileV1 or V2 are pretty much the same meaning that
> something else changed or the code that's generic to both did.
> I'd like to be able to associate those differences to code changes in
> order to understand what's going on. I would really appreciate if
> others also took some time to test it out or to think about what could
> cause this.
> Thx,
> J-D
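
For reference, the percentages in the quoted message can be rechecked from the raw throughput figures. This is only a minimal sketch in Python: it assumes the 1.1 figure is in millions of rows per second (the guess made in the reply above), and the helper name `pct_change` is made up for illustration.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Throughput figures taken from the quoted message.
scan_090 = 1.7e6   # rows scanned per second on 0.90
scan_092 = 1.1e6   # rows scanned per second on 0.92 (ASSUMED to be millions)
read_090 = 200e3   # random reads per second on 0.90
read_092 = 260e3   # random reads per second on 0.92

scan_change = pct_change(scan_090, scan_092)  # 0.92 relative to 0.90
read_change = pct_change(read_090, read_092)  # 0.92 relative to 0.90
scan_ratio = scan_090 / scan_092              # how many times faster 0.90 scans

print(f"scan change 0.90 -> 0.92: {scan_change:.1f}%")
print(f"random read change 0.90 -> 0.92: {read_change:.1f}%")
print(f"0.90 / 0.92 scan ratio: {scan_ratio:.2f}")
```

Under that assumption, scan throughput on 0.92 comes out roughly 35% lower than on 0.90 (equivalently, 0.90 scans about 1.5 times as fast), while the random-read gain matches the 30% quoted.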