HBase, mail # user - Tables gets Major Compacted even if they haven't changed


Re: Tables gets Major Compacted even if they haven't changed
Dave Latham 2013-09-10, 18:11
Major compactions can still be useful to improve locality - could we add a
condition to check for that too?
On Mon, Sep 9, 2013 at 10:41 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Interesting. I guess we could add a check to avoid major compactions if
> (1) no TTL is set, or we can show that all data is newer than the TTL,
> (2) there's only one file, and (3) there are no delete markers. All of
> these can be cheaply checked with some HFile metadata (we might have all
> the data needed already).
>
>
> That would take care of both of your scenarios.
>
> -- Lars
> ________________________________
> From: Premal Shah <[EMAIL PROTECTED]>
> To: user <[EMAIL PROTECTED]>
> Sent: Monday, September 9, 2013 9:02 PM
> Subject: Tables gets Major Compacted even if they haven't changed
>
>
> Hi,
> We have a bunch of tables in our HBase cluster. We have a script which
> makes sure all of them get Major Compacted once every 2 days. There are 2
> things I'm observing:
>
> 1) Table X has not been updated in a month. We have not inserted, updated or
> deleted data. However, it still major compacts every 2 days. All the
> regions in this table have only 1 store file.
>
> 2) Table Y has a few regions where the rowkey is essentially a timestamp.
> So, we only write to 1 region at a time. Over time, the region splits, and
> then we write to one of the split regions. Now, whenever we major compact
> the table, all regions get major compacted. Only 1 region has more than 1
> store file; every other region has exactly one.
>
> Is there a way to avoid compaction of regions that have not changed?
>
> We are using HBase 0.94.11
>
> --
> Regards,
> Premal Shah.
>
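
For illustration only, the check discussed in this thread could be sketched roughly as below. This is not HBase code: the class, the StoreFileInfo fields, the NO_TTL constant and the minLocality threshold are all hypothetical names, and the sketch simply assumes that the oldest cell timestamp, a delete-marker flag and a locality fraction are available per store file. It combines the three conditions from Lars's message with the locality condition Dave suggested.

```java
import java.util.List;

/** Hypothetical sketch; none of these names are real HBase APIs. */
final class MajorCompactionCheck {

    /** Assumed sentinel meaning "no TTL configured". */
    static final long NO_TTL = Long.MAX_VALUE;

    /** Assumed per-file summary, e.g. derived from HFile metadata. */
    static final class StoreFileInfo {
        final long oldestCellTimestampMs; // timestamp of the oldest cell in the file
        final boolean hasDeleteMarkers;   // true if the file contains any delete markers
        final double localityFraction;    // share of the file's blocks stored locally, 0.0 to 1.0

        StoreFileInfo(long oldestCellTimestampMs, boolean hasDeleteMarkers,
                      double localityFraction) {
            this.oldestCellTimestampMs = oldestCellTimestampMs;
            this.hasDeleteMarkers = hasDeleteMarkers;
            this.localityFraction = localityFraction;
        }
    }

    /**
     * Returns true if a major compaction of this store would be a no-op
     * under the conditions discussed in the thread.
     */
    static boolean canSkipMajorCompaction(List<StoreFileInfo> files, long ttlMs,
                                          long nowMs, double minLocality) {
        if (files.isEmpty()) {
            return true; // nothing to compact
        }
        if (files.size() != 1) {
            return false; // condition (2): exactly one file
        }
        StoreFileInfo f = files.get(0);
        // Condition (1): no TTL is set, or even the oldest cell has not expired.
        boolean nothingExpired = ttlMs == NO_TTL
                || nowMs - f.oldestCellTimestampMs < ttlMs;
        // Condition (3): no delete markers to purge.
        boolean noDeletes = !f.hasDeleteMarkers;
        // Extra condition: rewriting the file would not improve locality much.
        boolean localEnough = f.localityFraction >= minLocality;
        return nothingExpired && noDeletes && localEnough;
    }
}
```

Under these assumptions, a single-file store with no TTL, no delete markers and good locality (Table X above, or the idle regions of Table Y) would be skipped, while a store with several files or poor locality would still be compacted.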