HBase >> mail # dev >> Next big thing for HBase


Re: Next big thing for HBase
Thanks Varun.
seekTo is the worst case, though, and not representative for scanning; but it would be representative for gets.

Each seekTo needs to look up the right block in the index again and then seek from the beginning of the block it found.
reseek should be doing much better.
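
To make the seekTo/reseek difference concrete, here is a toy model. It is not the HBase HFile reader: the class name, the integer keys, and the blockSize parameter are all made up for illustration, and keys are assumed to be distinct and sorted. A fresh seek consults the block index and scans from the start of the block; a reseek in this simplified model only moves forward from the current position.

```java
import java.util.Arrays;

/** Toy model only, NOT HBase code: counts work for a fresh seek vs. a forward reseek. */
final class ToySeekModel {
    private final int[] keys;            // sorted, distinct keys standing in for KeyValues
    private final int blockSize;         // keys per block (hypothetical parameter)
    private final int[] blockFirstKeys;  // the "index": first key of every block
    private int pos = 0;                 // current scanner position in keys
    long indexLookups = 0;               // how often the index was consulted
    long keysScanned = 0;                // how many keys were stepped over

    ToySeekModel(int[] sortedKeys, int blockSize) {
        this.keys = sortedKeys;
        this.blockSize = blockSize;
        int blocks = (sortedKeys.length + blockSize - 1) / blockSize;
        this.blockFirstKeys = new int[blocks];
        for (int b = 0; b < blocks; b++) {
            blockFirstKeys[b] = sortedKeys[b * blockSize];
        }
    }

    /** Fresh seek: find the block via the index, then scan from the block start. */
    int seekTo(int target) {
        indexLookups++;
        int i = Arrays.binarySearch(blockFirstKeys, target);
        // On a miss binarySearch returns -(insertionPoint) - 1; the block that
        // may hold the target is the one just before the insertion point.
        int block = (i >= 0) ? i : Math.max(0, -i - 2);
        pos = block * blockSize;
        return scanForward(target);
    }

    /** Simplified reseek: no index lookup, just continue from the current position. */
    int reseek(int target) {
        return scanForward(target);
    }

    /** Advance to the first key >= target; returns its position (or keys.length). */
    private int scanForward(int target) {
        while (pos < keys.length && keys[pos] < target) {
            pos++;
            keysScanned++;
        }
        return pos;
    }

    public static void main(String[] args) {
        int[] keys = new int[100];
        for (int i = 0; i < keys.length; i++) keys[i] = i;
        ToySeekModel m = new ToySeekModel(keys, 10);
        m.seekTo(55);
        m.reseek(57);
        System.out.println("index lookups: " + m.indexLookups
                + ", keys scanned: " + m.keysScanned);
    }
}
```

In this model a reseek to a nearby key costs a couple of key comparisons, while every seekTo pays for an index lookup plus a scan from the block start. The real reader is more involved (a reseek may still have to load a later block), so treat the counters as an illustration of the argument, not a measurement.
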
500 iops across 4 HDD seems reasonable to me :)
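
A back-of-the-envelope check of that figure, as a sketch: it assumes the reads are spread evenly over the four drives and that each read costs exactly one disk I/O. The 7200 RPM spindle speed is my assumption for illustration; the thread does not say what drives were used.

```java
/** Rough arithmetic only; the 7200 RPM figure is an assumption, not from the thread. */
final class IopsEstimate {
    /** I/O operations per second handled by each drive, assuming an even spread. */
    static double perDriveIops(double totalIops, int drives) {
        return totalIops / drives;
    }

    /** Time budget per I/O in milliseconds at the given rate. */
    static double millisPerIo(double iops) {
        return 1000.0 / iops;
    }

    /** Average rotational latency: half a revolution, in milliseconds. */
    static double avgRotationalLatencyMs(double rpm) {
        return 0.5 * 60_000.0 / rpm;
    }

    public static void main(String[] args) {
        double perDrive = perDriveIops(500, 4);          // 125 per drive
        double budgetMs = millisPerIo(perDrive);         // 8 ms per random read
        double rotationMs = avgRotationalLatencyMs(7200);
        System.out.printf("per drive: %.0f iops, %.1f ms per read, "
                + "of which ~%.2f ms is rotational latency%n",
                perDrive, budgetMs, rotationMs);
    }
}
```

So 500 reads per second on 4 drives leaves about 8 ms per random read on each drive, roughly half of which a 7200 RPM disk would spend waiting for the platter to come around.
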
-- Lars

________________________________
 From: Varun Sharma <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, November 27, 2013 9:55 AM
Subject: Re: Next big thing for HBase
 

I think I sent that too early - I could buy the results for the HDD
comparison, but I don't buy the Fusion I/O comparison at all. I have been
able to push it much, much further on SSD(s) on EC2. It could well be that they
are not maxing out the region servers.

On Wed, Nov 27, 2013 at 9:53 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> I could buy these results for a totally disk bound application as far as
> reads go. I was running some experiments where I have HFiles on disk.
> Memory : data ratio is 1:2 - so half the data can fit in memory. Then I run
> "new HFileScanner()" and then scanner.seekTo("someKeyValue"). On a 4 HDD
> system, I can get ~400 reads max. The hard drives end up running quite hot - and
> the max I can push this thing to is 500 reads per second. Note that this is
> raw HFile seeks - no HBase or HDFS layers are present. I suspect HBase just
> issues way more iops than it needs to do.
>
> Varun
>
>
> On Wed, Nov 27, 2013 at 12:01 AM, Vladimir Rodionov <
> [EMAIL PROTECTED]> wrote:
>
>> Oh, I got it. "Next big thing for HBase" is not MapR M7, but global
>> optimization and tuning of HBase itself.
>>
>>
>> On Tue, Nov 26, 2013 at 11:56 PM, Vladimir Rodionov <[EMAIL PROTECTED]> wrote:
>>
>> > Why do you think I got excited? I do not work for MapR. MapR has posted
>> > benchmark results and some numbers for HBase look quite low. I thought
>> > maybe the community would be interested in these results.
>> >
>> >
>> > On Tue, Nov 26, 2013 at 10:04 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> >
>> >> Excuse me if I do not get too excited about a report published by MapR
>> >> that comes to the conclusion that MapR's M7 is faster than "other
>> >> distribution".
>> >>
>> >> -- Lars
>> >>
>> >>
>> >>
>> >> ________________________________
>> >>  From: Vladimir Rodionov <[EMAIL PROTECTED]>
>> >> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> >> Sent: Tuesday, November 26, 2013 8:00 PM
>> >> Subject: Next big thing for HBase
>> >>
>> >>
>> >> Global optimization and performance tuning:
>> >>
>> >>
>> http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=19&ved=0CG8QFjAIOAo&url=http%3A%2F%2Fwww.mapr.com%2FDownload-document%2F52-MapR-M7-Performance-Benchmark&ei=QGuVUr-cA6ewjAL_94DoCQ&usg=AFQjCNH2Brlp5n2rIAarEbj39c_X_lnvDg&sig2=bLTKxbspEgsRN3bJXUnspQ&bvm=bv.57155469,d.cGE&cad=rja
>> >>
>> >> Some numbers from this report do not look right for HBase. I do not
>> >> believe that 5 RS on Fusion drives score only 1605 reads per sec per
>> >> node.
>> >>
>> >
>> >
>>
>
>