HBase row level cache for random read
Gen Liu 2012-08-17, 23:42
Hi,
I'm dealing with a latency-sensitive random-read application on HBase. It seems that the block cache is designed for sequential reads. I use the default 64k block size, which is much bigger than my rows (10k after GZ compression). I assume the block cache stores compressed data, so one block can hold 6 rows, but with random reads maybe only 1 of those rows is ever accessed and 5/6 of the cache space is wasted. Is there a better way of caching for random reads? Lowering the block size to 32k or even 16k might be an option.
Thanks!
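The 5/6 estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch under simplifying assumptions that are not taken from HBase internals: rows are all the same size, a block holds a whole number of rows, and exactly one row per cached block is ever read. The function name `wasted_fraction` is made up for illustration, and the row size passed in should be the size of the row as it actually sits in the cache.

```python
def wasted_fraction(block_size_kb: int, row_size_kb: int) -> float:
    """Fraction of a cached block that is never read when only one row is hot.

    Simplifying assumptions (not HBase internals): equal-sized rows,
    a whole number of rows per block, and at least one row per block.
    """
    rows_per_block = max(1, block_size_kb // row_size_kb)
    return 1 - 1 / rows_per_block


# Compare the default 64k block with the smaller sizes discussed in the thread.
for block_kb in (64, 32, 16):
    unused = wasted_fraction(block_kb, 10)
    print(f"{block_kb}k block, 10k rows: {unused:.0%} of cached bytes unused")
```

Under this model, shrinking the block until it holds about one row removes the waste, at the cost of more blocks (and a larger block index) per file.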
Re: HBase row level cache for random read
Stack 2012-08-18, 19:33
On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu <[EMAIL PROTECTED]> wrote:
> I assume block cache store compressed data,

Generally it's not, not unless you use block encoding.

> one block can hold 6 rows, but in random read, maybe 1 row is ever accessed, 5/6 of the cache space is wasted.
> Is there a better way of caching for random read. Lower the block size to 32k or even 16k might be a choice.

We don't seem to list this as an option in this section, http://hbase.apache.org/book.html#perf.reading, but yes, if you have lots of random reads, a smaller block size could make a difference.

St.Ack
Re: HBase row level cache for random read
Elliott Clark 2012-08-20, 15:38
Also, if you are trying to limit the number of blocks read on random-read workloads, make sure that you have bloom filters on those tables. Having bloom filters turned on will limit the number of blocks that are read into memory.

On Sat, Aug 18, 2012 at 12:33 PM, Stack <[EMAIL PROTECTED]> wrote:
> On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu <[EMAIL PROTECTED]> wrote:
>> I assume block cache store compressed data,
>
> Generally its not, not unless you use block encoding.
>
>> one block can hold 6 rows, but in random read, maybe 1 row is ever accessed, 5/6 of the cache space is wasted.
>> Is there a better way of caching for random read. Lower the block size to 32k or even 16k might be a choice.
>>
>
> We don't seem to list this as an option in this section,
> http://hbase.apache.org/book.html#perf.reading, but yes, if lots of
> random reads, smaller block cache could make a difference.
>
> St.Ack
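To illustrate why the bloom filter advice helps random reads, here is a toy Bloom filter in Python. It is a generic sketch of the data structure, not HBase's implementation: the class, the hash scheme, the sizes, and the `store_files` layout are all assumptions made up for the example. The point it shows is that a filter can answer "definitely not here" for a row key, so a read can skip files (and their blocks) that cannot contain the row.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: bytes):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical store files, each with a filter built over its row keys.
store_files = {"file_a": [b"row-001", b"row-002"], "file_b": [b"row-900"]}
filters = {}
for name, keys in store_files.items():
    bf = BloomFilter()
    for key in keys:
        bf.add(key)
    filters[name] = bf


def files_to_read(row_key: bytes):
    """Return only the files whose filter says the row might be present."""
    return [name for name, bf in filters.items() if bf.might_contain(row_key)]


print(files_to_read(b"row-001"))
```

A file that really holds the row is always returned; a file that does not hold it is usually, though not always, skipped.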
Re: HBase row level cache for random read
Gen Liu 2012-08-23, 19:06
On 8/18/12 12:33 PM, "Stack" <[EMAIL PROTECTED]> wrote:
>On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu <[EMAIL PROTECTED]> wrote:
>> I assume block cache store compressed data,
>
>Generally its not, not unless you use block encoding.

Can you be more specific on this? Are you talking about https://issues.apache.org/jira/browse/HBASE-4218
So this is only available in 0.94 then? Thanks.

>
>> one block can hold 6 rows, but in random read, maybe 1 row is ever
>>accessed, 5/6 of the cache space is wasted.
>> Is there a better way of caching for random read. Lower the block size
>>to 32k or even 16k might be a choice.
>>
>
>We don't seem to list this as an option in this section,
> http://hbase.apache.org/book.html#perf.reading, but yes, if lots of
>random reads, smaller block cache could make a difference.
>
>St.Ack
Re: HBase row level cache for random read
Stack 2012-08-23, 22:28
On Thu, Aug 23, 2012 at 12:06 PM, Gen Liu <[EMAIL PROTECTED]> wrote:
>
>
> On 8/18/12 12:33 PM, "Stack" <[EMAIL PROTECTED]> wrote:
>
>>On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu <[EMAIL PROTECTED]> wrote:
>>> I assume block cache store compressed data,
>>
>>Generally its not, not unless you use block encoding.
> Can you be more specific on this? Are you talking about
> https://issues.apache.org/jira/browse/HBASE-4218
> So this is only available in 0.94 then? Thanks.
>>
>>> one block can hold 6 rows, but in random read, maybe 1 row is ever
>>>accessed, 5/6 of the cache space is wasted.
>>> Is there a better way of caching for random read. Lower the block size
>>>to 32k or even 16k might be a choice.
>>>
>>
>>We don't seem to list this as an option in this section,
>> http://hbase.apache.org/book.html#perf.reading, but yes, if lots of
>>random reads, smaller block cache could make a difference.
>>

See release note in https://issues.apache.org/jira/browse/HBASE-4218

St.Ack
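For readers unfamiliar with what "block encoding" buys, the following toy sketch shows one idea from that family: prefix encoding of sorted keys, where each key stores only the part that differs from the previous key. This is an illustration under my own assumptions, not the on-disk or in-cache format that HBASE-4218 defines; the function names and sample keys are made up.

```python
def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the common leading run of bytes in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(sorted_keys):
    """Encode each key as (bytes shared with the previous key, remaining suffix)."""
    encoded, prev = [], b""
    for key in sorted_keys:
        shared = shared_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared length, suffix) pairs."""
    keys, prev = [], b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Hypothetical row/column keys with long common prefixes, already sorted.
keys = [b"user12345/cf:email", b"user12345/cf:name", b"user12346/cf:email"]
encoded = prefix_encode(keys)
raw_bytes = sum(len(k) for k in keys)
suffix_bytes = sum(len(suffix) for _, suffix in encoded)
print(raw_bytes, suffix_bytes)
```

Because sorted keys in a block tend to share long prefixes, the encoded form keeps fewer bytes per key, which is why encoded blocks can take less room in the cache than plain ones.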