We're doing a version of that at Salesforce (we have our own M/R jobs, but the principle is the same).
Soon we'll run the backup M/R job over a snapshot for performance reasons, but even then the principle is the same.
Specifically, we keep 48h worth of live data in HBase itself (TTL=48h, MIN_VERSIONS=1, KEEP_DELETED_CELLS=true) and run the jobs every night as of 2h in the past (rounded to an exact hour boundary).
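To make the timing concrete, here is a rough sketch of how such a cutoff could be computed. It is not our actual tooling: the function names, the 24h window length, and the choice to round down (rather than to the nearest hour) are assumptions made for illustration. HBase cell timestamps are milliseconds since the epoch, so the helper converts to that unit.

```python
from datetime import datetime, timedelta, timezone


def backup_cutoff(now, lag_hours=2):
    """Return `now` minus the lag, rounded down to an exact hour boundary.

    Rounding down is an assumption; the point is only that every run
    ends on a clean, reproducible hour boundary.
    """
    lagged = now - timedelta(hours=lag_hours)
    return lagged.replace(minute=0, second=0, microsecond=0)


def to_hbase_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def nightly_window(now, lag_hours=2, window_hours=24):
    """Return (start_ms, end_ms) for one nightly run.

    window_hours=24 is a hypothetical choice for a once-a-night job:
    each run starts where the previous one ended.
    """
    end = backup_cutoff(now, lag_hours)
    start = end - timedelta(hours=window_hours)
    return to_hbase_millis(start), to_hbase_millis(end)


if __name__ == "__main__":
    start_ms, end_ms = nightly_window(datetime.now(timezone.utc))
    print(start_ms, end_ms)
```

With these assumed values, the oldest data a run reads is 26h old (24h window plus 2h lag), which stays inside the 48h TTL, so a failed run can still be repeated the following night. If you use the stock Export tool rather than your own M/R job, the two values would go in its optional starttime and endtime arguments.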
I think it's time I write an updated blog post. We plan to eventually open source the tools we've written.
From: Timo Schaepe <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Monday, December 23, 2013 10:53 AM
Subject: Consistent Backup strategy
We are looking for a consistent backup strategy using the export tool. Is this article still up to date, and can I still use it?
Thanks for answers.