HBase >> mail # user >> best read path explanation

Re: best read path explanation
This is enforced in the server-side scanner framework (ScanQueryMatcher, called by StoreScanner).
So while expired KeyValues are physically removed only when a compaction runs, they are logically hidden by the scanner framework.
In fact the same scanner framework is used to decide whether KeyValues are visible to a user scan or during a compaction.
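
To illustrate the point above, here is a minimal toy model (not HBase code) of one visibility rule shared by user scans and compactions. The names `KeyValue`, `visible`, and `compact`, and the parameters `max_versions` and `ttl_ms`, are assumptions made for this sketch; the real logic lives in ScanQueryMatcher and is considerably more involved (delete markers, for instance, are ignored here).

```python
from typing import List, NamedTuple


class KeyValue(NamedTuple):
    """Hypothetical stand-in for an HBase KeyValue."""
    row: str
    qualifier: str
    timestamp: int  # milliseconds
    value: str


def visible(cells: List[KeyValue], max_versions: int, ttl_ms: int,
            now_ms: int) -> List[KeyValue]:
    """Return the cells a reader should see, newest first within each column."""
    # Order by row, then qualifier, then newest timestamp first.
    ordered = sorted(cells, key=lambda kv: (kv.row, kv.qualifier, -kv.timestamp))
    result = []
    current_column = None
    versions_seen = 0
    for kv in ordered:
        column = (kv.row, kv.qualifier)
        if column != current_column:
            current_column = column
            versions_seen = 0
        if now_ms - kv.timestamp > ttl_ms:
            continue  # expired: hidden from readers even though still stored
        if versions_seen >= max_versions:
            continue  # beyond the configured number of versions
        versions_seen += 1
        result.append(kv)
    return result


def compact(cells: List[KeyValue], max_versions: int, ttl_ms: int,
            now_ms: int) -> List[KeyValue]:
    """Toy compaction: rewrite only what a scan would see, using the same rule."""
    return visible(cells, max_versions, ttl_ms, now_ms)


stored = [
    KeyValue("r1", "a", 100, "v1"),
    KeyValue("r1", "a", 200, "v2"),
    KeyValue("r1", "a", 300, "v3"),
    KeyValue("r1", "b", 50, "old"),
]
scan_result = visible(stored, max_versions=2, ttl_ms=500, now_ms=600)
after_compaction = compact(stored, max_versions=2, ttl_ms=500, now_ms=600)
```

In this sketch a scan sees only `v3` and `v2` before any compaction has run, and the compaction then physically keeps exactly those same cells.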
As for Ahmed's question, you can run the tests locally by just applying the patch to a svn checkout (I doubt it will still apply, though).
-- Lars

 From: Asaf Mesika <[EMAIL PROTECTED]>
Cc: lars hofhansl <[EMAIL PROTECTED]>
Sent: Monday, January 14, 2013 10:47 AM
Subject: Re: best read path explanation

I have a follow up question here: 
A column family can be defined to have a maximum number of versions per column qualifier. Is this enforced only by the client-side code (HTable) or also by the InternalScanner implementations?

On Monday, January 14, 2013, S Ahmed wrote:

Thanks Lars!
>Sort of a side question after following your proposed patch:
>Locally on your computer (laptop?), can those tests run in isolation, or do you
>need a fairly complicated setup to run them (all the various hbase
>dependencies like zookeeper, etc.)?
>On Sun, Jan 13, 2013 at 9:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> Does this help:
>> http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html ?
>> ________________________________
>>  From: S Ahmed <[EMAIL PROTECTED]>
>> Sent: Sunday, January 13, 2013 7:24 AM
>> Subject: best read path explanation
>> What is the best hbase read path explanation?
>> I understand that hbase stores data and doesn't allow for mutations, so I'm
>> confused as to how a read can get the latest data?
>> I'm guessing there are merges done between the immutable file stores, and
>> in-memory stores?
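
The guess in the original question can be sketched as a merge over sorted sources. This is a toy model under stated assumptions, not HBase's implementation: cells are plain `(row, qualifier, timestamp, value)` tuples, `merged_read` is a hypothetical helper, and delete markers, version limits, and TTLs are left out.

```python
import heapq


def sort_key(cell):
    """Order cells by row, qualifier, then newest timestamp first."""
    row, qualifier, timestamp, _value = cell
    return (row, qualifier, -timestamp)


def merged_read(memstore, store_files):
    """Merge the in-memory store with immutable files; the newest cell wins."""
    # Each source is sorted on its own, the way a flushed file would be.
    sources = [sorted(memstore, key=sort_key)]
    sources += [sorted(f, key=sort_key) for f in store_files]
    latest = {}
    # heapq.merge lazily interleaves the sorted sources into one sorted stream.
    for row, qualifier, _timestamp, value in heapq.merge(*sources, key=sort_key):
        # The first cell seen for a column is its newest version.
        latest.setdefault((row, qualifier), value)
    return latest


older_file = [("r1", "a", 100, "v1"), ("r2", "a", 100, "x1")]
newer_file = [("r1", "a", 200, "v2")]
memstore = [("r1", "a", 300, "v3")]

latest_with_memstore = merged_read(memstore, [older_file, newer_file])
latest_files_only = merged_read([], [older_file, newer_file])
```

No file is ever modified in place here: an update is just a newer cell in another source, and the read picks it up because it sorts ahead of the older versions.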