Offline merge tool question (HBase >> mail # user)

Bryan Beaudreault 2013-08-14, 00:17
Stack 2013-08-14, 05:32
Bryan Beaudreault 2013-08-14, 15:18

Re: Offline merge tool question
On Wed, Aug 14, 2013 at 8:18 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

> Thanks Stack.  We are going to test this on a test table in QA, but I'd
> still like a fallback plan if something goes wrong when we eventually do it
> in prod.
>
> One idea I had was to snapshot the table, clone from the snapshot, and
> perform the merge on the result of the clone.  I imagine I'd first want to
> major compact the clone, so that we rewrite all of the linked files into
> new files.  I also see at the end of this blog post (
>
> http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/
> )
> that merging regions on a snapshot table can cause data loss.
>
> Does my approach sound reasonable?  Disable table, snapshot table, create
> clone from snapshot, major compact clone, run merge on clone, enable clone,
> test, and if it fails, fall back to the original table.
>
>
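
For reference, that sequence would look something like the below from the
shell. This is only a rough sketch: the table, snapshot, and region names are
made up, the major compaction is asynchronous (wait for it to finish), and
you'll want to double-check the Merge tool usage for your version since, as I
recall, it refuses to run against a live instance.

  # hbase shell: snapshot the original, clone it, and major compact the clone
  # so its link files get rewritten as real HFiles.
  hbase shell <<'EOF'
  snapshot 'prod_table', 'prod_table_snap'
  clone_snapshot 'prod_table_snap', 'prod_table_clone'
  major_compact 'prod_table_clone'
  EOF

  # Offline merge of two adjacent regions of the clone (run while HBase is
  # down). The region arguments are placeholders for the full region names
  # (table,startkey,timestamp.encodedname.) taken from .META. or the web UI.
  hbase org.apache.hadoop.hbase.util.Merge prod_table_clone \
      'prod_table_clone,,1376438400000.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.' \
      'prod_table_clone,row50000,1376438400000.bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb.'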
 "...so that we rewrite all of the linked files into new files...."

<pinch-of-salt>I haven't looked at it in a while, but I thought the merge tool
wrote new files under the new merged region?  If so, won't that resolve the
reference/link files itself, so there's no need for the major compaction step?</pinch-of-salt>
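
An easy way to check, rather than trusting my memory: list a region of the
clone in HDFS before and after, and see whether the store files are still
links back at the snapshot or plain HFiles. Roughly, with paths assuming a
0.94-style layout and a column family named 'cf' (both just examples):

  # Store files for the clone's regions. Right after clone_snapshot these are
  # HFileLinks whose names, as I recall, embed the source table and region;
  # after a major compaction (or after the merge writes its own files) they
  # should be ordinary HFiles.
  hadoop fs -ls /hbase/prod_table_clone/*/cf/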

St.Ack