HBase, mail # user - Bloom Filter


Mohit Anchlia 2012-07-26, 20:30
Minh Duc Nguyen 2012-07-26, 20:52
Re: Bloom Filter
Mohit Anchlia 2012-07-27, 00:38
On Thu, Jul 26, 2012 at 1:52 PM, Minh Duc Nguyen <[EMAIL PROTECTED]> wrote:

> Mohit,
>
> According to HBase: The Definitive Guide,
>
> The row+column Bloom filter is useful when you cannot batch updates for a
> specific row, and end up with store files which all contain parts of the
> row. The more specific row+column filter can then identify which of the
> files contain the data you are requesting. Obviously, if you always load
> the entire row, this filter is once again hardly useful, as the region
> server will need to load the matching block out of each file anyway.  Since
> the row+column filter will require more storage, you need to do the math to
> determine whether it is worth the extra resources.
>

Thanks! I have time-series data, so I am thinking I should enable Bloom
filters on rows only.
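
To make the trade-off in the quoted passage concrete, here is a rough, self-contained sketch of the idea. It is a toy model, not HBase code: SimpleBloom, StoreFile, scatteredRow and the two counting methods are all hypothetical names made up for illustration, and the hashing scheme is an assumption rather than what HBase actually uses. It spreads one row across three store files, one column per file, and counts how many files a lookup would still have to open with each kind of filter.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: none of these names come from the HBase API.
class BloomSketch {

    /** Tiny Bloom filter that derives its probes by double hashing. */
    static final class SimpleBloom {
        private final BitSet bits;
        private final int size;
        private final int hashes;

        SimpleBloom(int size, int hashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashes = hashes;
        }

        private int index(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // odd step so successive probes differ
            return Math.floorMod(h1 + i * h2, size);
        }

        void add(String key) {
            for (int i = 0; i < hashes; i++) {
                bits.set(index(key, i));
            }
        }

        /** May return a false positive, never a false negative. */
        boolean mightContain(String key) {
            for (int i = 0; i < hashes; i++) {
                if (!bits.get(index(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Stand-in for a store file carrying both kinds of filter. */
    static final class StoreFile {
        final SimpleBloom rowBloom = new SimpleBloom(1 << 16, 3);
        final SimpleBloom rowColBloom = new SimpleBloom(1 << 16, 3);

        void put(String row, String qualifier) {
            rowBloom.add(row);                      // keyed by row only
            rowColBloom.add(row + "/" + qualifier); // keyed by row + column
        }
    }

    /** One row whose columns landed in three different files (unbatched updates). */
    static List<StoreFile> scatteredRow() {
        List<StoreFile> files = new ArrayList<>();
        for (String qualifier : new String[] {"c1", "c2", "c3"}) {
            StoreFile f = new StoreFile();
            f.put("row1", qualifier);
            files.add(f);
        }
        return files;
    }

    static int filesToReadWithRowBloom(List<StoreFile> files, String row) {
        int count = 0;
        for (StoreFile f : files) {
            if (f.rowBloom.mightContain(row)) count++;
        }
        return count;
    }

    static int filesToReadWithRowColBloom(List<StoreFile> files, String row, String qualifier) {
        int count = 0;
        for (StoreFile f : files) {
            if (f.rowColBloom.mightContain(row + "/" + qualifier)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<StoreFile> files = scatteredRow();
        System.out.println("row filter, files to read: "
                + filesToReadWithRowBloom(files, "row1"));
        System.out.println("row+col filter, files to read: "
                + filesToReadWithRowColBloom(files, "row1", "c2"));
    }
}
```

In this model the row-only filter cannot rule out any file, because every file holds some part of row1, while the row+column filter will normally narrow a single-column read down to the one file that holds that column (barring false positives). If reads always fetch the whole row, every file has to be opened either way, so the extra space for row+column keys buys nothing, which is the point the book passage makes.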

>
>
>    ~ Minh
>
> On Thu, Jul 26, 2012 at 4:30 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > Is it advisable to enable bloom filters on the column family?
> >
> > Also, why is it called a global kill switch?
> >
> > Bloom Filter Configuration
> >   2.9.1. io.hfile.bloom.enabled global kill switch
> >
> > io.hfile.bloom.enabled in Configuration serves as the kill switch in case
> > something goes wrong. Default = true.
> >
>
Alex Baranau 2012-07-27, 14:25
Mohit Anchlia 2012-07-27, 23:09
Stack 2012-07-27, 21:40