HBase >> mail # user >> best read path explanation


Re: best read path explanation
This is enforced in the server-side scanner framework (ScanQueryMatcher, called by StoreScanner).
So while expired KeyValues are physically removed only when a compaction runs, they are logically hidden by the scanner framework in the meantime.
In fact, the same scanner framework is used to decide which KeyValues are visible, both for a user scan and during a compaction.
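
To make that concrete, here is a rough toy model in Python. It is not HBase code and does not mirror ScanQueryMatcher's actual logic; Cell, visible_cells, user_scan and compact are all made-up names, and the TTL and version rules are simplified assumptions. It only illustrates the idea that one visibility check can serve both a read and a compaction.

```python
from collections import namedtuple

# Hypothetical stand-in for a KeyValue; not the real HBase class.
Cell = namedtuple("Cell", ["row", "qualifier", "timestamp", "value"])


def visible_cells(cells, max_versions, ttl_ms, now_ms):
    """Yield the cells that should be visible, newest version first per column.

    Simplified assumption: a cell is expired when its age exceeds ttl_ms,
    and only the newest max_versions unexpired cells per column are kept.
    """
    # Group by column and put the newest timestamp first within each column.
    ordered = sorted(cells, key=lambda c: (c.row, c.qualifier, -c.timestamp))
    current_column = None
    versions_seen = 0
    for cell in ordered:
        column = (cell.row, cell.qualifier)
        if column != current_column:
            current_column = column
            versions_seen = 0
        if now_ms - cell.timestamp > ttl_ms:
            continue  # expired: hidden even though it is still stored
        if versions_seen >= max_versions:
            continue  # over the version limit for this column
        versions_seen += 1
        yield cell


def user_scan(stored, max_versions, ttl_ms, now_ms):
    # A read only hides cells; the stored list is left untouched.
    return list(visible_cells(stored, max_versions, ttl_ms, now_ms))


def compact(stored, max_versions, ttl_ms, now_ms):
    # A compaction applies the same check and keeps only what passes it.
    return list(visible_cells(stored, max_versions, ttl_ms, now_ms))


stored = [
    Cell("r1", "q", 10, "old"),
    Cell("r1", "q", 100, "a"),
    Cell("r1", "q", 200, "b"),
    Cell("r1", "q", 300, "c"),
]
scan_result = user_scan(stored, max_versions=2, ttl_ms=250, now_ms=310)
compacted = compact(stored, max_versions=2, ttl_ms=250, now_ms=310)
```

In this model the scan and the compaction agree on what is visible; the only difference is that the compaction result replaces what is stored.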
As for Ahmed's question, you can run the tests locally by just applying the patch to an svn checkout (I doubt it will still apply, though).
-- Lars

________________________________
 From: Asaf Mesika <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc: lars hofhansl <[EMAIL PROTECTED]>
Sent: Monday, January 14, 2013 10:47 AM
Subject: Re: best read path explanation
 

I have a follow-up question here:
A column family can be defined to have a maximum number of versions per column qualifier value. Is this enforced only by the client side code (HTable) or also by the InternalScanner implementations?

On Monday, January 14, 2013, S Ahmed  wrote:

>Thanks Lars!
>
>Sort of a side question after following your proposed patch:
>https://issues.apache.org/jira/secure/attachment/12511771/5268-v5.txt
>
>Locally on your computer (laptop?), can those tests run in isolation, or do you
>need a fairly complicated setup to run them (all the various HBase
>dependencies like ZooKeeper, etc.)?
>
>
>On Sun, Jan 13, 2013 at 9:33 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Does this help:
>> http://hadoop-hbase.blogspot.com/2012/01/scanning-in-hbase.html ?
>>
>>
>>
>>
>> ________________________________
>>  From: S Ahmed <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Sent: Sunday, January 13, 2013 7:24 AM
>> Subject: best read path explanation
>>
>> What is the best hbase read path explanation?
>>
>> I understand that hbase stores data and doesn't allow for mutations, so I'm
>> confused as to how a read can get the latest data?
>>
>> I'm guessing there are merges done between the immutable file stores, and
>> in-memory stores?
>>
>