Thanks for answering. What are your thoughts on copyTable vs. export/import
for my use case? Is one tool less likely than the other to copy
inconsistent data?
I want to do an incremental copy of a live cluster to minimize downtime.
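
To make the incremental idea concrete, here is a rough Python sketch of how the
copy passes could be planned with copyTable. It only builds the command lines
and does not run anything. It assumes the --starttime, --endtime and --new.name
options as printed by CopyTable's usage message. The table names and cutoff
timestamps are made up for illustration. As far as I understand, a time-range
scan does not carry over deletes issued between passes, so treat this as a
starting point and not a consistency guarantee.

```python
# Sketch: plan incremental CopyTable passes over successive time windows.
# Assumption: CopyTable accepts --starttime/--endtime (epoch millis) and
# --new.name, with the start inclusive and the end exclusive.

COPYTABLE = "org.apache.hadoop.hbase.mapreduce.CopyTable"


def copytable_args(table, new_name, start_ms=None, end_ms=None):
    """Build the argv for one CopyTable run over an optional time window."""
    args = ["hbase", COPYTABLE]
    if start_ms is not None:
        args.append(f"--starttime={start_ms}")
    if end_ms is not None:
        args.append(f"--endtime={end_ms}")
    args.append(f"--new.name={new_name}")
    args.append(table)
    return args


def plan_passes(table, new_name, cutoffs_ms):
    """Return one argv per pass.

    The first pass copies everything written before cutoffs_ms[0]; each
    later pass copies only the window since the previous cutoff.
    """
    passes = []
    previous = None
    for cutoff in cutoffs_ms:
        passes.append(copytable_args(table, new_name, previous, cutoff))
        previous = cutoff
    return passes


if __name__ == "__main__":
    # Hypothetical table names and cutoffs, for illustration only.
    for argv in plan_passes("old_table", "new_table",
                            [1350000000000, 1350003600000]):
        print(" ".join(argv))
```

The bulk pass runs while the cluster is live, and only the last short window
needs writes to be paused, which is where the downtime saving would come from.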
On Tue, Oct 16, 2012 at 8:47 AM, Kevin O'Dell <[EMAIL PROTECTED]> wrote:
> I think a better approach would be to create a pre-split table and then do
> the export/import. This will save you from having to script the merges,
> which can end badly for META if done wrong.
> On Mon, Oct 15, 2012 at 5:31 PM, Shrijeet Paliwal
> <[EMAIL PROTECTED]> wrote:
> > We moved to 0.92.2 some time ago and with that, increased the max file
> > size setting to 4GB (from 2GB). Also, an application-triggered cleanup
> > deleted lots of unwanted rows.
> > These two combined have gotten us to a state where lots of regions are
> > smaller than the desired size.
> > Merging regions two at a time seems time-consuming and will be hard to
> > automate. https://issues.apache.org/jira/browse/HBASE-1621 automates
> > merging, but it is not stable.
> > I am interested in knowing about other possible approaches folks have
> > tried. What do you guys think about a copyTable-based approach? (old
> > ---copyTable---> new and then rename new to old)
> > -Shrijeet
> Kevin O'Dell
> Customer Operations Engineer, Cloudera