Re: Export / Import and table splits
I'd go with snapshots, since they let you avoid all the I/O of the
export/import. Note that the consistency model is different, and there is
no start/end time option, so after the clone you would have to delete the
rows older than tstart or newer than tend.

Matteo

On Tue, May 14, 2013 at 1:48 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Hi Jeremy,
>
> Thanks for sharing this.
>
> I will take a look at it, and will most probably also give the snapshot
> option a try.
>
> JM
>
> 2013/5/7 Jeremy Carroll <[EMAIL PROTECTED]>
>
> > https://github.com/phobos182/hadoop-hbase-tools/blob/master/hbase/copy_table.rb
> >
> > I wrote a quick script to do it with mechanize + ruby. I have a new tool
> > which I'm polishing up that does the same thing in Python but using the
> > HBase REST interface to get the data.
> >
> >
> > On Tue, May 7, 2013 at 3:23 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > When we do an export, we only export the data. When we then import
> > > it back, we need to make sure the table is pre-split correctly,
> > > otherwise we might hotspot some servers.
> > >
> > > If you simply export and then import without pre-splitting at all,
> > > you will most probably bring some servers down because they will be
> > > overwhelmed with splits and compactions.
> > >
> > > Do we have any tool to pre-split a table the same way another table
> > > is already split?
> > >
> > > Something like
> > > > duplicate 'source_table', 'target_table'
> > >
> > > Which would create a new table called 'target_table' with exactly the
> > > same parameters as 'source_table' and the same region boundaries?
> > >
> > > If we don't have one, would it be useful to have one?
> > >
> > > Or even something like:
> > > > create 'target_table', 'f1', {SPLITS_MODEL => 'source_table'}
> > >
> > >
> > > JM
> > >
> >
>
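
As a rough illustration of the pre-split idea discussed in this thread, here
is a minimal Python sketch that turns a list of region boundaries from a
source table into an HBase shell `create` statement using the shell's
SPLITS option. It is not the script linked above and not an existing HBase
tool. The function names and the sample boundaries are hypothetical, the
boundaries are assumed to have been collected already (for example from the
master web UI), and row keys are assumed to be plain printable strings.
Binary keys would need different quoting, and only the split points are
copied, not the other table parameters.

```python
from typing import List, Tuple


def split_keys_from_regions(regions: List[Tuple[str, str]]) -> List[str]:
    """Return the sorted split points for a list of (start_key, end_key) regions.

    Every non-empty region start key is a split point. The first region of a
    table has an empty start key, which is not a split point, so it is skipped.
    """
    return sorted({start for start, _end in regions if start})


def shell_quote(value: str) -> str:
    """Quote a value as a single-quoted string for the HBase (JRuby) shell.

    Assumption: the value is printable text; binary keys are not handled.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def create_command(table: str, family: str, splits: List[str]) -> str:
    """Build a `create` statement that pre-splits the new table."""
    base = f"create {shell_quote(table)}, {shell_quote(family)}"
    if not splits:
        # No split points: the table starts with a single region.
        return base
    keys = ", ".join(shell_quote(key) for key in splits)
    return f"{base}, SPLITS => [{keys}]"


if __name__ == "__main__":
    # Hypothetical boundaries of a four-region source table.
    source_regions = [("", "g"), ("g", "n"), ("n", "t"), ("t", "")]
    splits = split_keys_from_regions(source_regions)
    print(create_command("target_table", "f1", splits))
```

The printed statement can be pasted into the HBase shell before running the
import, so that the target table starts with the same region boundaries as
the source instead of a single region.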