HBase >> mail # user >> Does HBase RegionServer benefit from OS Page Cache


Re: Does HBase RegionServer benefit from OS Page Cache
> With very large heaps and a GC that can handle them (perhaps the G1 GC),
another option which might be worth experimenting with is a value (KV)
cache independent of the block cache which could be enabled on a per-table
basis
Thanks Andy for bringing this up. We had some discussions a while ago
about a row-cache (or KV cache):
http://search-hadoop.com/m/XTlxT1xRtYw/hbase+key+value+cache+from%253Aenis&subj=RE+keyvalue+cache

The takeaway was that if you are mostly doing point gets, rather than
scans, this cache might be better.
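For illustration, here is a minimal client-side sketch of that point-get case,
assuming the usual Table/Get/Result client API; the class name and the simple LRU
policy are made up, and a real per-table KV cache would live inside the region
server rather than in the client:

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RowCachingReader {
  private final Table table;                    // underlying HBase table
  private final Map<String, Result> rowCache;   // LRU of whole rows, independent of the block cache

  public RowCachingReader(Table table, final int maxRows) {
    this.table = table;
    this.rowCache = new LinkedHashMap<String, Result>(16, 0.75f, true) {
      protected boolean removeEldestEntry(Map.Entry<String, Result> eldest) {
        return size() > maxRows;                // evict the least-recently-used row
      }
    };
  }

  // Point get: served from the row cache when possible, otherwise from the region server.
  public synchronized Result get(byte[] row) throws IOException {
    String key = Bytes.toString(row);
    Result cached = rowCache.get(key);
    if (cached != null) {
      return cached;
    }
    Result fetched = table.get(new Get(row));
    rowCache.put(key, fetched);
    return fetched;
  }
}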

> 1) [HBASE-7404]: L1/L2 block cache
I knew about the BucketCache, but not that it could hold compressed blocks. Is
that the case, or are you suggesting we can add that to this L2 cache?
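For reference, wiring up an L1 (on-heap LRU) plus L2 (BucketCache) setup generally
looks like the sketch below. The property names are the standard block cache and
HBASE-7404 BucketCache knobs; the sizes are arbitrary examples, how the BucketCache
size is interpreted (MB vs. a fraction) varies by release, and the question of
holding compressed blocks in the L2 is left aside here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class L2CacheConfig {
  public static Configuration configure() {
    Configuration conf = HBaseConfiguration.create();
    // L1: on-heap LRU block cache, as a fraction of the region server heap.
    conf.setFloat("hfile.block.cache.size", 0.25f);
    // L2: off-heap BucketCache (HBASE-7404); 4096 here is meant as roughly 4 GB.
    conf.set("hbase.bucketcache.ioengine", "offheap");
    conf.setInt("hbase.bucketcache.size", 4096);
    return conf;
  }
}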

>  2) [HBASE-5263] Preserving cached data on compactions through
cache-on-write
Thanks, this is the same idea. I'll track the ticket.
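To make that concrete, a very rough sketch of what selective cache-on-write around
a compaction could look like; everything below is hypothetical bookkeeping (it is
not the HBASE-5263 implementation or any HBase API):

import java.util.HashSet;
import java.util.Set;

public class CompactionCacheTracker {
  // Row keys seen in blocks that were cached for the compaction's input files.
  private final Set<String> hotRows = new HashSet<String>();

  // Called while reading an input block that was found in the block cache.
  public void rememberCachedBlockRows(Iterable<String> rowKeysInBlock) {
    for (String row : rowKeysInBlock) {
      hotRows.add(row);
    }
  }

  // Called per output block: cache it on write only if it contains a previously hot row.
  public boolean shouldCacheOnWrite(Iterable<String> rowKeysInOutputBlock) {
    for (String row : rowKeysInOutputBlock) {
      if (hotRows.contains(row)) {
        return true;
      }
    }
    return false;
  }
}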

Enis
On Mon, Mar 25, 2013 at 12:18 PM, Liyin Tang <[EMAIL PROTECTED]> wrote:

> Hi Enis,
> Good ideas! The HBase community is driving these 2 items:
> 1) [HBASE-7404]: L1/L2 block cache
> 2) [HBASE-5263] Preserving cached data on compactions through
> cache-on-write
>
> Thanks a lot
> Liyin
> ________________________________________
> From: Enis Söztutar [[EMAIL PROTECTED]]
> Sent: Monday, March 25, 2013 11:24 AM
> To: hbase-user
> Cc: lars hofhansl
> Subject: Re: Does HBase RegionServer benefit from OS Page Cache
>
> Thanks Liyin for sharing your use cases.
>
> Related to those, I was thinking of two improvements:
>  - AFAIK, MySQL keeps the compressed and uncompressed versions of the blocks
> in its block cache, failing over to the compressed copy if the decompressed one
> gets evicted. With very large heaps, maybe keeping around the compressed
> blocks in a secondary cache makes sense?
>  - A compaction will trash the cache. But maybe we can track which keyvalues
> (inside blocks that are cached) belong to the files in the compaction, and mark
> the blocks of the resulting compacted file which contain previously cached
> keyvalues to be cached after the compaction. I have to research the
> feasibility of this approach.
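A rough sketch of the first improvement above, an uncompressed L1 backed by a
compressed secondary tier; the class name and the use of java.util.zip are purely
illustrative and not the HBase BlockCache interface:

import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class TwoTierBlockCache {
  private final Map<String, byte[]> uncompressed = new HashMap<String, byte[]>(); // L1: hot blocks
  private final Map<String, byte[]> compressed = new HashMap<String, byte[]>();   // L2: compressed copies

  public void put(String blockKey, byte[] block) {
    uncompressed.put(blockKey, block);
    compressed.put(blockKey, deflate(block));
  }

  // L1 eviction drops only the uncompressed copy; the compressed copy stays in L2.
  public void evictFromL1(String blockKey) {
    uncompressed.remove(blockKey);
  }

  public byte[] get(String blockKey) throws DataFormatException {
    byte[] block = uncompressed.get(blockKey);
    if (block != null) {
      return block;                    // L1 hit: no decompression cost
    }
    byte[] packed = compressed.get(blockKey);
    if (packed == null) {
      return null;                     // miss in both tiers: caller reads from HDFS
    }
    block = inflate(packed);           // L2 hit: pay some CPU, skip the disk read
    uncompressed.put(blockKey, block); // promote back into L1
    return block;
  }

  private static byte[] deflate(byte[] data) {
    Deflater d = new Deflater();
    d.setInput(data);
    d.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!d.finished()) {
      out.write(buf, 0, d.deflate(buf));
    }
    d.end();
    return out.toByteArray();
  }

  private static byte[] inflate(byte[] data) throws DataFormatException {
    Inflater inf = new Inflater();
    inf.setInput(data);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!inf.finished()) {
      out.write(buf, 0, inf.inflate(buf));
    }
    inf.end();
    return out.toByteArray();
  }
}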
>
> Enis
>
>
> On Sun, Mar 24, 2013 at 10:15 PM, Liyin Tang <[EMAIL PROTECTED]> wrote:
>
> > The block cache is for uncompressed data, while the OS page cache contains the
> > compressed data. Unless the request pattern is a full-table sequential scan, the
> > block cache is still quite useful. I think the size of the block cache should be
> > the amount of hot data we want to retain within a compaction cycle, which is
> > quite hard to estimate in some use cases.
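As a back-of-the-envelope illustration of that sizing rule (the numbers are made
up): with a 16 GB region server heap and roughly 4 GB of hot, uncompressed data
touched per compaction cycle, the on-heap block cache fraction would come out
around 0.25:

public class BlockCacheSizing {
  public static void main(String[] args) {
    double heapGb = 16.0;             // region server heap
    double hotDataPerCycleGb = 4.0;   // estimated hot, uncompressed data per compaction cycle
    // Cap the fraction so enough heap is left for the memstore and working memory.
    double fraction = Math.min(hotDataPerCycleGb / heapGb, 0.4);
    System.out.printf("hfile.block.cache.size = %.2f%n", fraction);
  }
}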
> >
> >
> > Thanks a lot
> > Liyin
> > ________________________________________
> > From: lars hofhansl [[EMAIL PROTECTED]]
> > Sent: Saturday, March 23, 2013 10:20 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Does HBase RegionServer benefit from OS Page Cache
> >
> > Interesting.
> >
> > > 2) The blocks in the block cache will naturally be invalidated quickly after
> > > the compactions.
> >
> > Should one keep the block cache small in order to increase the OS page
> > cache?
> >
> > Does your data suggest we should not use the block cache at all?
> >
> >
> > Thanks.
> >
> > -- Lars
> >
> >
> >
> > ________________________________
> >  From: Liyin Tang <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: Saturday, March 23, 2013 9:44 PM
> > Subject: Re: Does HBase RegionServer benefit from OS Page Cache
> >
> > We (Facebook) are closely monitoring the OS page cache hit ratio in our
> > production environments. My experience is that if your data access pattern is
> > very random, then the OS page cache won't help you much even though the
> > data locality is very high. On the other hand, if the requests are always
> > against the recent data points, then the page cache hit ratio could be much
> > higher.
> >
> > Actually, there are lots of optimizations that could be done in HDFS. For
> > example, we are working on using fadvise to drop the 2nd/3rd replica data from
> > the OS page cache, so that it could potentially improve your OS page cache by 3X.
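As a conceptual sketch of that fadvise idea: after writing a replica that is
unlikely to be read locally (the 2nd/3rd copy), hint the kernel to drop it from
the page cache. The fadvise binding below is hypothetical; HDFS does this through
its own native IO layer, and only the posix_fadvise(2) call and the
POSIX_FADV_DONTNEED hint are real:

import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;

public class ReplicaWriter {
  // Hypothetical JNI binding to posix_fadvise(2).
  private static native int fadvise(FileDescriptor fd, long offset, long len, int advice);

  // POSIX_FADV_DONTNEED from <fcntl.h> (4 on Linux): these pages will not be needed soon.
  private static final int POSIX_FADV_DONTNEED = 4;

  // Write a 2nd/3rd replica and immediately advise the kernel to drop its pages,
  // keeping the page cache free for data that will actually be read locally.
  public static void writeSecondaryReplica(String path, byte[] data) throws IOException {
    FileOutputStream out = new FileOutputStream(path);
    try {
      out.write(data);
      out.getFD().sync();   // flush dirty pages before advising the kernel
      fadvise(out.getFD(), 0, data.length, POSIX_FADV_DONTNEED);
    } finally {
      out.close();
    }
  }
}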