HBase >> mail # user >> Offline merge tool question


Re: Offline merge tool question
Thanks Stack.  We are going to test this on a test table in QA, but I'd
still like a fallback plan if something goes wrong when we eventually do it
in prod.

One idea I had was to snapshot the table, clone from the snapshot, and
perform the merge on the result of the clone.  I imagine I'd first want to
major compact the clone, so that we rewrite all of the linked files into
new files.  I also see at the end of this blog post (
http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/)
that merging regions on a snapshot table can cause data loss.

Does my approach sound reasonable?  Disable the table, snapshot it, create a
clone from the snapshot, major compact the clone, run the merge on the clone,
enable the clone, and test.  If that fails, fall back to the original table.
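
The sequence above can be sketched as an ordered plan that stops at the first failing step and falls back to the original table. This is only a minimal illustration in plain Java: the class `MergeRehearsal`, the table names, and the stubbed step actions are all hypothetical, and nothing here calls into HBase. In a real run each stub would be replaced by the corresponding admin operation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

public class MergeRehearsal {

    /** Outcome of a run: the steps that completed and the table to serve from. */
    public static final class Result {
        public final List<String> completed;
        public final String tableToUse;

        Result(List<String> completed, String tableToUse) {
            this.completed = completed;
            this.tableToUse = tableToUse;
        }
    }

    /**
     * Runs the steps in insertion order and stops at the first one that
     * reports failure. The clone is returned only if every step succeeded;
     * otherwise the untouched original table is the fallback.
     */
    public static Result run(Map<String, BooleanSupplier> steps,
                             String original, String clone) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
            if (!step.getValue().getAsBoolean()) {
                return new Result(completed, original);
            }
            completed.add(step.getKey());
        }
        return new Result(completed, clone);
    }

    /**
     * Builds the plan described in the message. Every action is a stub
     * (assumption); mergeSucceeds simulates the outcome of the merge step.
     */
    public static Map<String, BooleanSupplier> plan(boolean mergeSucceeds) {
        Map<String, BooleanSupplier> steps = new LinkedHashMap<>();
        steps.put("disable table", () -> true);
        steps.put("snapshot table", () -> true);
        steps.put("clone from snapshot", () -> true);
        steps.put("major compact clone", () -> true);
        steps.put("merge regions on clone", () -> mergeSucceeds);
        steps.put("enable clone", () -> true);
        steps.put("test clone", () -> true);
        return steps;
    }

    public static void main(String[] args) {
        Result ok = run(plan(true), "mytable", "mytable_clone");
        System.out.println(ok.completed.size() + " steps done, use " + ok.tableToUse);

        Result failed = run(plan(false), "mytable", "mytable_clone");
        System.out.println(failed.completed.size() + " steps done, use " + failed.tableToUse);
    }
}
```

The point of the design is that the original table is never modified by any step after the snapshot, so a failure anywhere in the plan leaves it available as the fallback.
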
On Wed, Aug 14, 2013 at 1:32 AM, Stack <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 13, 2013 at 5:17 PM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
>
> > I'm running cdh4.2 hbase 0.94.2, and am looking to merge some regions in a
> > table.  Looking at Merge.java, it seems to require that the entire cluster
> > be offline.  However, I also notice an HMerge.java which doesn't appear to
> > do the same validation.
> >
> > Two questions:
> >
> > 1) Why does Merge.java validate the entire cluster is down, as opposed to
> > just the single table being disabled?
> >
> >
> It is dumb/simple/old.
>
>
>
> > 2) Could I write my own tool that uses HMerge, so as to merge regions in
> > the disabled table without bringing the whole cluster down?
> >
> >
> Yes.  You can't do much harm if table is offline.
>
> St.Ack
>