Re: Merge large number of regions
Thanks Kevin! Very useful pointers.

On Wed, Oct 17, 2012 at 7:39 AM, Kevin O'dell <[EMAIL PROTECTED]> wrote:

> Shrijeet,
>
>   Here is a post on doing a proper incremental backup with Export/Import:
>
>
> http://hadoop-hbase.blogspot.com/2012/04/timestamp-consistent-backups-in-hbase.html
> I am a fan of this one as it is well laid out.  Breaking this up for your
> use case should be pretty easy.
>
> CopyTable should work just as easily -
> http://www.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/
>
>
> If you follow the above it is really going to be a matter of preference.
>
> On Tue, Oct 16, 2012 at 1:16 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:
>
> > Hi Kevin,
> >
> > Thanks for answering. What are your thoughts on CopyTable vs.
> > export/import for my use case? Will one tool have a lower chance of
> > copying inconsistent data than the other?
> >
> > I wish to do an incremental copy of a live cluster to minimize downtime.
> >
> > On Tue, Oct 16, 2012 at 8:47 AM, Kevin O'dell <[EMAIL PROTECTED]> wrote:
> >
> > > Shrijeet,
> > >
> > >  I think a better approach would be to pre-split a table and then do the
> > > export/import.  This will save you from having to script the merges,
> > > which can end badly for META if done wrong.
> > >
> > > On Mon, Oct 15, 2012 at 5:31 PM, Shrijeet Paliwal <[EMAIL PROTECTED]> wrote:
> > >
> > > > We moved to 0.92.2 some time ago and, with that, increased the max file
> > > > size setting from 2GB to 4GB. In addition, an application-triggered
> > > > cleanup operation deleted lots of unwanted rows.
> > > > Together, these two changes have left us with lots of regions that are
> > > > smaller than the desired size.
> > > >
> > > > Merging regions two at a time seems time-consuming and will be hard to
> > > > automate. https://issues.apache.org/jira/browse/HBASE-1621 automates
> > > > merging, but it is not stable.
> > > >
> > > > I am interested in hearing about other approaches folks have tried.
> > > > What do you think about a CopyTable-based approach? (old
> > > > ---copyTable---> new, and then rename new to old)
> > > >
> > > > -Shrijeet
> > > >
> > >
> > >
> > >
> > > --
> > > Kevin O'Dell
> > > Customer Operations Engineer, Cloudera
> > >
> >
>
>
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
>
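
For readers following the thread: the "incremental copy of a live cluster" idea can be sketched as a series of back-to-back time windows, each exported separately and then imported into the new (pre-split) table. The sketch below only builds the command lines; it does not run them. It assumes the positional form of the Export tool, `<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]`, with each window read as start-inclusive, end-exclusive. The table name, output directory, version count, and checkpoint timestamps are all hypothetical placeholders, and the helper names are made up for illustration.

```python
from typing import List, Tuple

# Class name of the HBase Export MapReduce tool.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"


def windows(checkpoints_ms: List[int]) -> List[Tuple[int, int]]:
    """Turn ascending checkpoints into back-to-back (start, end) windows.

    The first window starts at 0 so that it covers all older data;
    every later window starts where the previous one ended.
    """
    result = []
    start = 0
    for end in checkpoints_ms:
        if end <= start:
            raise ValueError("checkpoints must be positive and strictly ascending")
        result.append((start, end))
        start = end
    return result


def export_command(table: str, out_dir: str, versions: int,
                   start_ms: int, end_ms: int) -> List[str]:
    """Build the argv for one Export run covering a single time window."""
    return ["hbase", EXPORT_CLASS, table, out_dir,
            str(versions), str(start_ms), str(end_ms)]


def plan(table: str, base_dir: str, versions: int,
         checkpoints_ms: List[int]) -> List[List[str]]:
    """One Export command per window, each writing to its own directory."""
    return [
        export_command(table, f"{base_dir}/{start}-{end}", versions, start, end)
        for start, end in windows(checkpoints_ms)
    ]


if __name__ == "__main__":
    # Hypothetical table, directory, and timestamps (milliseconds since epoch).
    for argv in plan("old_table", "/backup/old_table", 1,
                     [1350000000000, 1350003600000]):
        print(" ".join(argv))
```

Each exported directory would then be loaded with the matching Import tool. Only the last, smallest window needs to run while writes to the old table are stopped, which is where the downtime saving comes from. The same windowing would apply to CopyTable through its --starttime and --endtime options.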