HBase, mail # user - Questions about HBase


Re: Questions about HBase
Ted Yu 2013-06-05, 03:44
> I found this jira: https://issues.apache.org/jira/browse/HBASE-5199 but
> I don't know if the compaction being talked about there is minor or major.

The optimization in HBASE-5199 applies to minor compaction selection.
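
As a rough illustration of the idea (a toy model, not the actual HBase code): if each store file records the timestamp of its newest cell, then a file whose newest cell is already past TTL holds only expired entries and can be dropped at selection time without being rewritten. The names ExpiredFileSketch, FileSummary and fullyExpired below are made up for this sketch and are not HBase classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are HBase classes.
class ExpiredFileSketch {

    // Hypothetical per-file summary: just the name and the newest cell timestamp.
    static final class FileSummary {
        final String name;
        final long maxTimestampMs;

        FileSummary(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    // Returns the names of files whose newest cell is older than (now - TTL).
    // Every cell in such a file has expired, so the whole file can be dropped.
    static List<String> fullyExpired(List<FileSummary> files, long ttlMs, long nowMs) {
        long cutoff = nowMs - ttlMs;
        List<String> expired = new ArrayList<>();
        for (FileSummary f : files) {
            if (f.maxTimestampMs < cutoff) {
                expired.add(f.name);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        long ttl = 100_000L;
        List<FileSummary> files = new ArrayList<>();
        files.add(new FileSummary("old", 850_000L));   // newest cell is past TTL
        files.add(new FileSummary("mixed", 950_000L)); // still holds live cells
        System.out.println(fullyExpired(files, ttl, now)); // prints [old]
    }
}
```

A file that still has even one live cell (the "mixed" file here) is not selected by this check and would go through normal compaction.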

Cheers

On Tue, Jun 4, 2013 at 7:15 PM, Pankaj Gupta <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a few small questions regarding HBase. I've searched the forum but
> couldn't find clear answers, hence asking them here:
>
>
>    1. Does minor compaction remove HFiles in which all entries are out of
>    TTL, or does only major compaction do that? I found this jira:
>    https://issues.apache.org/jira/browse/HBASE-5199 but I don't know if
>    the compaction being talked about there is minor or major.
>    2. Is there a way of configuring major compaction to compact only files
>    older than a certain time, or to compact all the files except the latest
>    few? We basically want to use the time-based filtering optimization in
>    HBase to get the latest additions to the table, and since major
>    compaction bunches everything into one file, it would defeat the
>    optimization.
>    3. Is there a way to warm up the bloom filter and block index cache for
>    a table? This is for a case where I always want the bloom filters and
>    index to be all in memory, but not the data blocks themselves.
>    4. This one is related to what I read in the bloom filter section of
>    HBase: The Definitive Guide:
>    "Given a random row key you are looking for, it is very likely that this
>    key will fall in between two block start keys. The only way for HBase to
>    figure out if the key actually exists is by loading the block and
>    scanning it to find the key."
>    The above excerpt seems to imply to me that the search for a key inside a
>    block is linear, and I feel I must be reading it wrong. I would expect
>    the scan to be a binary search.
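
On question 4, a toy sketch may help separate the two steps the excerpt describes. As far as I understand the HFile layout, the binary search happens over the block index (the block start keys), while the cells inside the chosen block are variable-length and are walked in order. The code below only models that shape: BlockLookupSketch, findBlock and scanBlock are made-up names, and a "block" is simply a sorted list of row keys, not HBase's real reader.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: not HBase's HFile reader.
class BlockLookupSketch {

    // Binary search over block start keys. Returns the index of the last block
    // whose start key is <= rowKey, or -1 if rowKey sorts before every block.
    static int findBlock(String[] blockStartKeys, String rowKey) {
        int pos = Arrays.binarySearch(blockStartKeys, rowKey);
        if (pos >= 0) {
            return pos; // rowKey is exactly a block start key
        }
        int insertionPoint = -pos - 1;
        return insertionPoint - 1;
    }

    // Sequential scan inside the single candidate block.
    static boolean scanBlock(List<String> block, String rowKey) {
        for (String key : block) {
            int cmp = key.compareTo(rowKey);
            if (cmp == 0) {
                return true;
            }
            if (cmp > 0) {
                return false; // passed the position where rowKey would be
            }
        }
        return false;
    }

    static boolean contains(String[] startKeys, List<List<String>> blocks, String rowKey) {
        int b = findBlock(startKeys, rowKey);
        return b >= 0 && scanBlock(blocks.get(b), rowKey);
    }

    public static void main(String[] args) {
        String[] startKeys = {"a", "m"};
        List<List<String>> blocks = Arrays.asList(
                Arrays.asList("a", "c", "f"),
                Arrays.asList("m", "p", "z"));
        System.out.println(contains(startKeys, blocks, "c")); // prints true
        System.out.println(contains(startKeys, blocks, "d")); // prints false
    }
}
```

In this model the lookup for "d" still has to open the first block to learn that the key is absent, which is the cost a bloom filter is meant to avoid.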
>
>
> Thanks in Advance,
> Pankaj
>