Re: Cache invalidation in Blockcache
Very true.
We have been discussing additional table options that would promise that the client won't set any timestamps.
With that we'd never have newer versions in older HFiles (or vice versa).

So the memstore would have the newest version (if any), and we could potentially avoid looking into all HFiles, i.e. search from newest to oldest and only seek into the next (older) HFiles when we did not find what we're looking for in the ones already checked. In the worst case we'd obviously still have to seek into all files.

(All assuming that the time a region needs to move to another RegionServer is greater than the clock difference between RegionServers.)
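
To make the idea concrete, here is a minimal Python sketch of a newest-first lookup with early exit. It is a toy model, not HBase code: the `Store` type, the function name, and the dict-based "memstore" and "HFiles" are all hypothetical, and it assumes the proposed guarantee that clients never set timestamps, so a newer store never holds an older version of a row.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: each store maps rowkey -> (timestamp, value).
Store = Dict[str, Tuple[int, str]]


def get_latest_early_exit(
    memstore: Store, hfiles_newest_first: List[Store], row: str
) -> Tuple[Optional[Tuple[int, str]], int]:
    """Return (latest cell or None, number of HFiles probed).

    Only valid under the assumption that timestamps are assigned by the
    server, so the first store that contains the row holds its newest version.
    """
    if row in memstore:
        return memstore[row], 0  # found in memstore, no HFile touched
    probed = 0
    for hfile in hfiles_newest_first:
        probed += 1
        if row in hfile:
            return hfile[row], probed  # stop at the first (newest) hit
    return None, probed  # worst case: every file was probed


memstore = {"r3": (30, "c")}
hfiles = [{"r2": (20, "b2")}, {"r1": (10, "a"), "r2": (5, "b1")}]
print(get_latest_early_exit(memstore, hfiles, "r2"))  # ((20, 'b2'), 1)
```

A row that exists in no store still costs a probe of every file, which is the worst case mentioned above.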
Maybe it's time to warm this up again... Before HBase 1.0.
________________________________
 From: Vladimir Rodionov <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Sunday, March 30, 2014 1:29 PM
Subject: RE: Cache invalidation in Blockcache
 

The problem is how to determine "the latest version" without checking
the memstore and all relevant HFiles. So the answer is going to be: no.
You cannot ask only the MemStore, and you cannot rely on the
MemStore always keeping the latest version of a rowkey.

HBase will pull all the versions of a rowkey from the MemStore and the HFiles, compare them,
and only then return the latest. This is why it is so important:

A. To have a large block cache.
B. To have the working data set fit into this block cache.
C. To use Bloom filters to avoid checking HFiles that do not contain versions of the requested rowkey.
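
As a rough illustration of the read path described above, here is a small Python sketch that merges candidates from the memstore and every HFile whose Bloom filter might contain the row. It is a toy model under stated assumptions, not HBase's implementation: `TinyBloom`, `HFile`, and `read_latest` are hypothetical names, and the filter parameters are arbitrary.

```python
import hashlib


class TinyBloom:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.field |= 1 << p

    def might_contain(self, key):
        return all((self.field >> p) & 1 for p in self._positions(key))


class HFile:
    """Hypothetical immutable file: rowkey -> list of (timestamp, value)."""

    def __init__(self, cells):
        self.cells = cells
        self.bloom = TinyBloom()
        for row in cells:
            self.bloom.add(row)


def read_latest(memstore, hfiles, row):
    """Return (cell with the highest timestamp or None, HFiles actually read)."""
    candidates = list(memstore.get(row, []))
    files_read = 0
    for hfile in hfiles:
        if not hfile.bloom.might_contain(row):
            continue  # Bloom filter says the row is definitely not here
        files_read += 1
        candidates.extend(hfile.cells.get(row, []))
    latest = max(candidates, key=lambda c: c[0]) if candidates else None
    return latest, files_read


# The memstore holds a write with a client-set, older timestamp.
memstore = {"r1": [(5, "written last, old timestamp")]}
hfiles = [HFile({"r1": [(9, "newest")]}), HFile({"r2": [(7, "x")]})]
latest, files_read = read_latest(memstore, hfiles, "r1")
print(latest)  # (9, 'newest')
```

In this example the newest version lives in an HFile even though the memstore also has a cell for the row, which is why the memstore alone cannot answer the read.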

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [EMAIL PROTECTED]

________________________________________

From: chandra kant [[EMAIL PROTECTED]]
Sent: Sunday, March 30, 2014 12:12 AM
To: [EMAIL PROTECTED]
Subject: Re: Cache invalidation in Blockcache

I am using HBase version 0.94. Just one clarification: if I request
just a single row that is still in the memstore, then the read operation will
simply send this result back to the client. This latest version of the row won't be
cached in the BlockCache. Block caching will only happen if data is read from
store files (HFiles).
What if the latest version of my row is in the memstore and the other 2 versions are in
an HFile, and I want all 3 versions? In this case, will the cached block with
that row key be evicted from the BlockCache?

Thanks
Chandra
On Sunday, 30 March 2014, Anoop John <[EMAIL PROTECTED]> wrote: