Smart Managed Major Compactions
Hello all,

Before I start, I'm running cdh3u2, so 0.90.4.

I am looking into managing major compactions ourselves, but there don't appear to be any mechanisms I can hook into to determine which tables need compacting.  Ideally, each time my cron job runs it would compact the table that has gone the longest since its last major compaction, but I can't find a way to access this metric.
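To make the setup concrete, the cron-driven part itself would just be something along these lines; pickNextTable() is a stand-in for the piece I'm missing, i.e. choosing the table that has gone the longest without a major compaction:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ManagedMajorCompaction {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // Stand-in for the missing piece: pick the table that has gone the
    // longest without a major compaction.  This is the metric I can't find.
    String table = pickNextTable();
    // Queues a major compaction on the table; the call is asynchronous.
    admin.majorCompact(table);
  }

  private static String pickNextTable() {
    throw new UnsupportedOperationException("this is the part I'm asking about");
  }
}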

The default major compaction algorithm seems to use the oldest modification time across a region's store files to determine when it was last major compacted.  I know this is not ideal, but it seems good enough.  Unfortunately, I don't see an easy way to get at this value myself.

Alternatively, I could keep my own compaction log, but I'd rather not have to do that if there is another way.  Is there some easy way to access this value that I'm not seeing?  I know I could construct the paths to the store files myself, but that seems less than ideal as well (e.g. it might break when we upgrade).
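For what it's worth, the path-construction approach I'd like to avoid would look roughly like this -- it hard-codes the hbase.rootdir/<table>/<region>/<family>/<storefile> layout, which is exactly what worries me about upgrades:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OldestStoreFile {
  // Walks hbase.rootdir/<table>/<region>/<family>/<storefile> on HDFS and
  // returns the oldest store file modification time for the table.  No
  // attempt is made to filter out region-level .tmp dirs, etc.
  static long oldestStoreFileTime(Configuration conf, String table) throws Exception {
    Path tableDir = new Path(conf.get("hbase.rootdir"), table);
    FileSystem fs = tableDir.getFileSystem(conf);
    long oldest = Long.MAX_VALUE;
    for (FileStatus region : fs.listStatus(tableDir)) {
      if (!region.isDir()) continue;               // skip stray files at the table level
      for (FileStatus family : fs.listStatus(region.getPath())) {
        if (!family.isDir()) continue;             // skip .regioninfo, etc.
        for (FileStatus storeFile : fs.listStatus(family.getPath())) {
          oldest = Math.min(oldest, storeFile.getModificationTime());
        }
      }
    }
    return oldest;
  }
}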

Thanks

--
Bryan Beaudreault