HBase, mail # user - Offline merge tool question


Re: Offline merge tool question
Stack 2013-08-14, 16:41
On Wed, Aug 14, 2013 at 8:18 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

> Thanks Stack.  We are going to test this on a test table in QA, but I'd
> still like a fallback plan if something goes wrong when we eventually do it
> in prod.
>
> One idea I had was to snapshot the table, clone from the snapshot, and
> perform the merge on the result of the clone.  I imagine I'd first want to
> major compact the clone, so that we rewrite all of the linked files into
> new files.  I also see at the end of this blog post
> (http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/)
> that merging regions on a snapshot table can cause data loss.
>
> Does my approach sound reasonable?  Disable table, snapshot table, create
> clone from snapshot, major compact clone, run merge on clone, enable clone,
> test, and if that fails, fall back to the original table.
>
>
 "...so that we rewrite all of the linked files into new files...."

<pinch-of-salt>I haven't looked at it in a while, but I thought merge wrote
new files under the new merge region?  If so, won't that undo the references,
so there is no need for the major compaction step?</pinch-of-salt>

St.Ack
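
For reference, the plan quoted above can be written out as an ordered checklist. The following is only a rough sketch: it builds the command strings and executes nothing. The table, snapshot, clone, and region names are hypothetical placeholders, the extra "disable clone" step is an assumption (the offline Merge tool expects the table to be offline, and depending on the version may require HBase itself to be shut down), and the major compaction is left optional since it may turn out to be unnecessary.

```python
def build_plan(table, snapshot, clone, region_a, region_b, compact_first=True):
    """Return the ordered (where, command) steps of the fallback plan.

    Nothing is executed; this only assembles the commands for review.
    All names passed in are hypothetical placeholders.
    """
    steps = [
        ("hbase shell", f"disable '{table}'"),
        ("hbase shell", f"snapshot '{table}', '{snapshot}'"),
        ("hbase shell", f"clone_snapshot '{snapshot}', '{clone}'"),
    ]
    if compact_first:
        # Optional: rewrites the clone's linked files into new files.
        # Note that major_compact only requests the compaction.
        steps.append(("hbase shell", f"major_compact '{clone}'"))
    steps += [
        # Assumption: take the clone offline before the offline merge.
        ("hbase shell", f"disable '{clone}'"),
        ("command line",
         f"hbase org.apache.hadoop.hbase.util.Merge {clone} {region_a} {region_b}"),
        ("hbase shell", f"enable '{clone}'"),
    ]
    return steps


if __name__ == "__main__":
    plan = build_plan("mytable", "mytable_snap", "mytable_clone",
                      "<region-1>", "<region-2>")
    for number, (where, command) in enumerate(plan, start=1):
        print(f"{number}. [{where}] {command}")
```

If the merged clone fails testing, the original table is untouched and can simply be re-enabled.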