HBase >> mail # user >> Questions about HBase


Re: Questions about HBase
When you do the first read of this region, wouldn't this load all bloom
filters?

On Wed, Jun 5, 2013 at 8:43 AM, ramkrishna vasudevan <
[EMAIL PROTECTED]> wrote:

> As for whether you can warm up the bloom filter and block cache, I don't
> think that is possible right now.
>
> Regards
> Ram
>
>
> On Wed, Jun 5, 2013 at 10:57 AM, Asaf Mesika <[EMAIL PROTECTED]>
> wrote:
>
> > If you read the HFile v2 document on the HBase site, you will understand
> > how the search for a record works, and why the search inside a block is
> > linear while the search for the right block is binary.
> > Also bear in mind that the number of keys in a block is not large, since a
> > block in an HFile is 64 KB by default; thus out of a 10 GB HFile you fully
> > scan only 64 KB.
> >
> > On Wednesday, June 5, 2013, Pankaj Gupta wrote:
> >
> > > Thanks for the replies. I'll take a look at
> > > src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java.
> > >
> > > @ramkrishna: I do want to have the bloom filter and block index all the
> > > time. For good read performance they're critical in my workflow. The worry
> > > is that when HBase is restarted it will take a long time for them to get
> > > populated again and performance will suffer. If there were a way of loading
> > > them quickly and warming up the table, then we'd be able to restart HBase
> > > without causing a slowdown in processing.
> > >
> > >
> > > On Tue, Jun 4, 2013 at 9:29 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > >
> > > > bq. But I am not very sure if we can control the files getting selected
> > > > for compaction in the older versions.
> > > >
> > > > Same mechanism is available in 0.94
> > > >
> > > > Take a look at
> > > > src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
> > > > where you would find the following methods (and more):
> > > >
> > > >   public void preCompactSelection(
> > > >       final ObserverContext<RegionCoprocessorEnvironment> c,
> > > >       final Store store, final List<StoreFile> candidates,
> > > >       final CompactionRequest request)
> > > >   public InternalScanner preCompact(
> > > >       ObserverContext<RegionCoprocessorEnvironment> e,
> > > >       final Store store, final InternalScanner scanner)
> > > >       throws IOException {
> > > >
> > > > Cheers
> > > >
> > > > On Tue, Jun 4, 2013 at 8:14 PM, ramkrishna vasudevan <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > >>Does Minor compaction remove HFiles in which all entries are out of
> > > > >    TTL or does only Major compaction do that?
> > > > > Yes, it applies to minor compactions.
> > > > > >>Is there a way of configuring major compaction to compact only files
> > > > >    older than a certain time or to compress all the files except the
> > > > >    latest few?
> > > > > In the latest trunk version the compaction algorithm itself is pluggable.
> > > > > There are some coprocessor hooks that give control over the scanner that
> > > > > gets created for compaction, with which we can control the KVs being
> > > > > selected. But I am not very sure if we can control the files getting
> > > > > selected for compaction in the older versions.
> > > > > >> The above excerpt seems to imply to me that the search for key inside
> > > > >    a block is linear and I feel I must be reading it wrong. I would
> > > > >    expect the scan to be a binary search.
> > > > > Once the data block for a key is identified, we seek to the beginning of
> > > > > the block and then do a linear search until we reach the exact key that
> > > > > we are looking for. This is because internally the data (KVs) are stored
> > > > > as byte buffers per block, following this pattern:
> > > > > <keylength><valuelength><keybytearray><valuebytearray>
> > > > > >>Is there a way to warm up the bloom filter and block index cache for
> > > > >    a table?
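
The lookup described in this thread (binary search over the block index to
find the right data block, then a linear scan over records laid out as
<keylength><valuelength><keybytearray><valuebytearray>) can be sketched
roughly as below. This is only an illustration under simplifying assumptions,
not HBase's actual code: the class BlockLookupSketch and all of its methods
are hypothetical, lengths are written as 4-byte ints, keys are plain strings,
and real HFile blocks also carry headers and structured KeyValue keys that
are ignored here. Because the records are variable-length, the scan has to
read each length pair to find where the next record starts, which is why the
search inside a block is linear.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BlockLookupSketch {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Builds one block from key/value pairs that are already sorted by key,
    // using the <keylength><valuelength><key><value> layout (assumed 4-byte lengths).
    static ByteBuffer buildBlock(String[][] kvs) {
        int size = 0;
        for (String[] kv : kvs) {
            size += 8 + kv[0].getBytes(StandardCharsets.UTF_8).length
                      + kv[1].getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String[] kv : kvs) {
            byte[] k = kv[0].getBytes(StandardCharsets.UTF_8);
            byte[] v = kv[1].getBytes(StandardCharsets.UTF_8);
            buf.putInt(k.length).putInt(v.length).put(k).put(v);
        }
        buf.flip();
        return buf;
    }

    // Binary search over the first key of each block: returns the index of the
    // last block whose first key is <= the search key, or -1 if there is none.
    static int findBlock(byte[][] firstKeys, byte[] key) {
        int lo = 0, hi = firstKeys.length - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(firstKeys[mid], key) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    // Linear scan from the start of the block until the key is found or passed.
    static byte[] scanBlock(ByteBuffer block, byte[] key) {
        ByteBuffer buf = block.duplicate(); // independent position, shared bytes
        while (buf.remaining() >= 8) {
            int keyLen = buf.getInt();
            int valLen = buf.getInt();
            byte[] k = new byte[keyLen];
            buf.get(k);
            int cmp = compare(k, key);
            if (cmp == 0) {
                byte[] v = new byte[valLen];
                buf.get(v);
                return v;
            }
            if (cmp > 0) return null;              // keys are sorted, so we passed it
            buf.position(buf.position() + valLen); // skip this record's value
        }
        return null;
    }

    // Full lookup: binary search for the block, then linear scan inside it.
    static String lookup(byte[][] firstKeys, ByteBuffer[] blocks, String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        int i = findBlock(firstKeys, k);
        if (i < 0) return null;
        byte[] v = scanBlock(blocks[i], k);
        return v == null ? null : new String(v, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer[] blocks = {
            buildBlock(new String[][] {{"apple", "1"}, {"banana", "2"}}),
            buildBlock(new String[][] {{"cherry", "3"}, {"grape", "4"}})
        };
        byte[][] firstKeys = {
            "apple".getBytes(StandardCharsets.UTF_8),
            "cherry".getBytes(StandardCharsets.UTF_8)
        };
        System.out.println(lookup(firstKeys, blocks, "grape")); // prints 4
        System.out.println(lookup(firstKeys, blocks, "fig"));   // prints null
    }
}
```

Only the one block selected by the binary search is ever scanned, which is
the point made above about a large HFile: the linear part of the search is
bounded by the block size, not the file size.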