

Re: Bloom Filter
On Fri, Jul 27, 2012 at 7:25 AM, Alex Baranau <[EMAIL PROTECTED]> wrote:

> Very good explanation (and food for thought) about using Bloom filters in
> HBase in the answers here:
> http://www.quora.com/How-are-bloom-filters-used-in-HBase.
>
> Should we link to it from the Apache HBase book (ref guide)?
>

Thanks, this is helpful.

>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> Solr
>
> On Thu, Jul 26, 2012 at 8:38 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Jul 26, 2012 at 1:52 PM, Minh Duc Nguyen <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Mohit,
> > >
> > > According to HBase: The Definitive Guide,
> > >
> > > The row+column Bloom filter is useful when you cannot batch updates
> > > for a specific row, and end up with store files which all contain
> > > parts of the row. The more specific row+column filter can then
> > > identify which of the files contain the data you are requesting.
> > > Obviously, if you always load the entire row, this filter is once
> > > again hardly useful, as the region server will need to load the
> > > matching block out of each file anyway. Since the row+column filter
> > > will require more storage, you need to do the math to determine
> > > whether it is worth the extra resources.
> > >
> >
> > Thanks! I have time-series data, so I am thinking I should enable the
> > row-only Bloom filter.
> >
> > >
> > >
> > >    ~ Minh
> > >
> > > On Thu, Jul 26, 2012 at 4:30 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > >
> > > > Is it advisable to enable bloom filters on the column family?
> > > >
> > > > Also, why is it called a global kill switch?
> > > >
> > > > Bloom Filter Configuration
> > > >   2.9.1. io.hfile.bloom.enabled global kill switch
> > > >
> > > > io.hfile.bloom.enabled in Configuration serves as the kill switch
> > > > in case something goes wrong. Default = true.
> > > >
> > >
> >
>
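
To make the row versus row+column distinction quoted above from HBase: The Definitive Guide more concrete, here is a minimal, self-contained Java sketch. It is not HBase's implementation and does not use the HBase API. The class names (ToyBloomFilter, StoreFileSketch, BloomDemo), the row key "sensor1", the column names, the filter size of 1024 bits, and the three hash functions are all assumptions made up for illustration. It models one row whose columns ended up in three different store files, then counts how many files each kind of filter fails to rule out for a single-column read.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy Bloom filter for illustration only; this is not HBase's implementation. */
final class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    ToyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** False means "definitely absent"; true means "possibly present". */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit position from two seeded FNV-1a-style hashes.
    private int index(String key, int i) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193);
        return Math.floorMod(h1 + i * h2, size);
    }

    private static int fnv1a(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}

/** Hypothetical stand-in for one store file that keeps both kinds of filter. */
final class StoreFileSketch {
    final ToyBloomFilter rowFilter = new ToyBloomFilter(1024, 3);
    final ToyBloomFilter rowColFilter = new ToyBloomFilter(1024, 3);

    void put(String row, String column) {
        rowFilter.add(row);                        // row-only key
        rowColFilter.add(rowColKey(row, column));  // row+column key
    }

    static String rowColKey(String row, String column) {
        return row + "\0" + column;
    }
}

final class BloomDemo {
    /** Counts the files a row-only filter cannot rule out for this row. */
    static int filesToReadByRow(List<StoreFileSketch> files, String row) {
        int count = 0;
        for (StoreFileSketch file : files) {
            if (file.rowFilter.mightContain(row)) {
                count++;
            }
        }
        return count;
    }

    /** Counts the files a row+column filter cannot rule out for this cell. */
    static int filesToReadByRowCol(List<StoreFileSketch> files, String row, String column) {
        String key = StoreFileSketch.rowColKey(row, column);
        int count = 0;
        for (StoreFileSketch file : files) {
            if (file.rowColFilter.mightContain(key)) {
                count++;
            }
        }
        return count;
    }

    /** One row whose three columns each landed in a different file. */
    static List<StoreFileSketch> sampleFiles() {
        String[] columns = {"temp", "humidity", "pressure"};
        List<StoreFileSketch> files = new ArrayList<>();
        for (String column : columns) {
            StoreFileSketch file = new StoreFileSketch();
            file.put("sensor1", column);
            files.add(file);
        }
        return files;
    }

    public static void main(String[] args) {
        List<StoreFileSketch> files = sampleFiles();
        System.out.println("row filter candidates: "
                + filesToReadByRow(files, "sensor1"));
        System.out.println("row+column filter candidates: "
                + filesToReadByRowCol(files, "sensor1", "humidity"));
    }
}
```

Because every file holds part of "sensor1", the row-only filter cannot exclude any of the three files. The row+column filter always keeps the file that really holds "humidity" and will usually exclude the other two, although a Bloom filter can report false positives, so the exact count is not guaranteed. If the whole row is read every time, all three files have to be opened either way, which is the case the book describes as "hardly useful".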