
Bryan Beaudreault 2013-08-14, 00:17
Stack 2013-08-14, 05:32
Bryan Beaudreault 2013-08-14, 15:18
Re: Offline merge tool question
On Wed, Aug 14, 2013 at 8:18 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

> Thanks Stack.  We are going to test this on a test table in QA, but I'd
> still like a fallback plan if something goes wrong when we eventually do it
> in prod.
> One idea I had was to snapshot the table, clone from the snapshot, and
> perform the merge on the result of the clone.  I imagine I'd first want to
> major compact the clone, so that we rewrite all of the linked files into
> new files.  I also see at the end of this blog post
> (http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/)
> that merging regions on a snapshot table can cause data loss.
> Does my approach sound reasonable?  Disable table, snapshot table, create
> clone from snapshot, major compact clone, run merge on clone, enable clone,
> test; if it fails, fall back to the original table.
 "...so that we rewrite all of the linked files into new files...."

<pinch-of-salt>I haven't looked at it in a while, but I thought merge wrote
new files under the new merge region?  If so, won't that undo the references,
so there's no need for the major compaction step?</pinch-of-salt>
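
For reference, the fallback plan quoted above could be sketched roughly as follows. This is only an illustration, not a tested procedure: the helper names (`pre_merge_steps`, `merge_command`, `post_merge_steps`) and the table, snapshot, and region names are hypothetical, and the `compact` flag is there because the major compaction step may turn out to be unnecessary. The script only builds the HBase shell lines and the offline merge tool invocation as strings; it does not talk to a cluster.

```python
def pre_merge_steps(table, snapshot, clone, compact=True):
    """HBase shell lines to run before the merge, in the order from the plan."""
    steps = [
        f"disable '{table}'",
        f"snapshot '{table}', '{snapshot}'",
        f"clone_snapshot '{snapshot}', '{clone}'",
    ]
    if compact:
        # Optional: rewrite the clone's linked files into new files.
        steps.append(f"major_compact '{clone}'")
    return steps


def merge_command(clone, region_a, region_b):
    """Command line for the offline merge tool (run from a terminal, not the HBase shell)."""
    return f"bin/hbase org.apache.hadoop.hbase.util.Merge {clone} {region_a} {region_b}"


def post_merge_steps(clone):
    """HBase shell lines to run after the merge, before testing the clone."""
    return [f"enable '{clone}'"]


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    for line in pre_merge_steps("mytable", "mytable_snap", "mytable_clone"):
        print(line)
    print(merge_command("mytable_clone", "<region-1>", "<region-2>"))
    for line in post_merge_steps("mytable_clone"):
        print(line)
```

The original table is never touched after the snapshot is taken, so falling back means dropping the clone and re-enabling the original.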