
Premal Shah 2013-09-10, 04:02
Re: Tables gets Major Compacted even if they haven't changed
Interesting. I guess we could add a check to avoid major compactions if (1) no TTL is set, or we can show that all data is newer than the TTL cutoff, (2) there's only one file, and (3) there are no delete markers. All of these can be checked cheaply with some HFile metadata (we might have all the data needed already).
That would take care of both of your scenarios.
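
A rough sketch of what such a check could look like. This is not existing HBase code: StoreFileSummary, NO_TTL and canSkipMajorCompaction are made-up names for illustration, and it assumes the oldest cell timestamp and the delete marker count can be read from the HFile metadata.

```java
import java.util.List;

public class MajorCompactionSkipCheck {

    /** Hypothetical per-file summary; assumed to be filled from HFile metadata. */
    static final class StoreFileSummary {
        final long oldestCellTimestampMs; // timestamp of the oldest cell in the file
        final long deleteMarkerCount;     // number of delete markers in the file

        StoreFileSummary(long oldestCellTimestampMs, long deleteMarkerCount) {
            this.oldestCellTimestampMs = oldestCellTimestampMs;
            this.deleteMarkerCount = deleteMarkerCount;
        }
    }

    /** Hypothetical sentinel meaning "no TTL configured for this column family". */
    static final long NO_TTL = Long.MAX_VALUE;

    /**
     * Returns true only if a major compaction of this store would rewrite the
     * single file without dropping anything.
     */
    static boolean canSkipMajorCompaction(List<StoreFileSummary> files,
                                          long ttlMs, long nowMs) {
        // (2) more than one file: there is something to merge
        if (files.size() != 1) {
            return false;
        }
        StoreFileSummary file = files.get(0);
        // (3) delete markers present: compaction would purge them and the cells they cover
        if (file.deleteMarkerCount > 0) {
            return false;
        }
        // (1) no TTL set: nothing can expire
        if (ttlMs == NO_TTL) {
            return true;
        }
        // (1) TTL set: skip only if even the oldest cell has not expired yet
        return file.oldestCellTimestampMs > nowMs - ttlMs;
    }
}
```

With something like this, the single-file regions of a table that has not changed (no TTL, no deletes) would be skipped, and a region with more than one store file would still be compacted.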

-- Lars
From: Premal Shah <[EMAIL PROTECTED]>
Sent: Monday, September 9, 2013 9:02 PM
Subject: Tables gets Major Compacted even if they haven't changed
We have a bunch of tables in our HBase cluster. We have a script which
makes sure all of them get Major Compacted once every 2 days. There are 2
things I'm observing:

1) Table X has not been updated in a month. We have not inserted, updated or
deleted data. However, it still major compacts every 2 days. All the
regions in this table have only 1 store file.

2) Table Y has a few regions where the rowkey is essentially a timestamp.
So, we only write to 1 region at a time. Over time, the region splits, and
then we write to one of the split regions. Now, whenever we major compact
the table, all regions get major compacted. Only 1 region has more than 1
store file; every other region has exactly one.

Is there a way to avoid compaction of regions that have not changed?

We are using HBase 0.94.11

Premal Shah.