HBase >> mail # user >> Smart Managed Major Compactions


Smart Managed Major Compactions
Hello all,

Before I start: I'm running cdh3u2, so HBase 0.90.4.

I am looking into managing major compactions ourselves, but there doesn't appear to be any mechanism I can hook into to determine which tables need compacting.  Ideally, each time my cron job runs it would compact the table that has gone longest since its last major compaction, but I can't find a way to access this metric.
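To make the cron-job idea concrete, here is a minimal sketch of the selection step only, assuming you already have a table-name-to-timestamp mapping from somewhere (the timestamps are exactly the data I can't get out of HBase; the function name and structure are hypothetical):

```python
def pick_table_to_compact(last_compacted):
    """Given a dict mapping table name -> epoch seconds of its last
    major compaction, return the table compacted least recently.
    Purely illustrative scheduling logic; obtaining the timestamps
    is the open question in this thread."""
    if not last_compacted:
        return None
    # min() over the timestamp values finds the oldest compaction.
    return min(last_compacted, key=last_compacted.get)

# Example: "orders" was compacted longest ago, so it is picked next.
tables = {"users": 1340000000, "orders": 1330000000, "logs": 1335000000}
print(pick_table_to_compact(tables))
```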

The default major compaction algorithm seems to use the oldest modification time across all store files in a region to determine when that region was last major compacted.  I know this is not exact, but it seems good enough.  Unfortunately, I don't see an easy way to get at this value.
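The heuristic above can be sketched as a pure function: after a major compaction rewrites a region's store files, the oldest file's modification time approximates the time of that compaction (a region that was never major compacted will just report its oldest flush). This is my reading of the default behavior, not HBase code:

```python
def last_major_compaction_estimate(storefile_mtimes):
    """storefile_mtimes: list of epoch-seconds modification times for
    all store files in one region.  A major compaction rewrites every
    store file, so the oldest surviving file's mtime approximates when
    the last major compaction ran (or the region's oldest flush, if it
    was never major compacted)."""
    if not storefile_mtimes:
        return None  # region has no store files yet
    return min(storefile_mtimes)

# Files flushed after the compaction have newer mtimes and are ignored.
print(last_major_compaction_estimate([1330000000, 1335000000, 1340000000]))
```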

Alternatively, I could keep my own compaction log, but I'd rather not if there is another way.  Is there some easy way to access this value that I am not seeing?  I know I could construct the paths to the store files myself, but that seems less than ideal too (e.g., it might break when we upgrade).

Thanks

--
Bryan Beaudreault

Stack 2012-07-19, 00:52