HBase user mailing list: backup strategies


Thread:
  Rita           2012-08-16, 11:31
  Paul Mackles   2012-08-16, 11:53
  Rita           2012-08-22, 10:48

Re: backup strategies
From: Rita
Let's say I have a huge table and I want to back it up onto a system with a
lot of disk space. Would this work: take all the keys and export the table
in chunks by selectively picking ranges? For instance, if the keys run from
0-100000, I would back up keys 0-50000 into backup_dir_A and keys
50001-100000 into backup_dir_B. Would that be feasible?

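To make the chunking idea concrete, here is a minimal sketch against the plain
HBase client API (not the Export MapReduce job). It assumes fixed-width,
zero-padded string keys so that byte order matches numeric order; the table
name and output paths are made up. Note that a Scan's stop row is exclusive,
so adjacent chunks share their boundary key:

    import java.io.PrintWriter;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ChunkedBackup {
      // Dump every cell in [startKey, stopKey) to a tab-separated local file.
      static void backupRange(HTable table, String startKey, String stopKey,
                              String outFile) throws Exception {
        Scan scan = new Scan(Bytes.toBytes(startKey), Bytes.toBytes(stopKey));
        scan.setCaching(1000);                  // fetch rows in larger batches
        ResultScanner scanner = table.getScanner(scan);
        PrintWriter out = new PrintWriter(outFile, "UTF-8");
        try {
          for (Result row : scanner) {
            for (KeyValue kv : row.raw()) {     // one output line per cell
              out.println(Bytes.toStringBinary(kv.getRow()) + "\t"
                  + Bytes.toStringBinary(kv.getFamily()) + ":"
                  + Bytes.toStringBinary(kv.getQualifier()) + "\t"
                  + kv.getTimestamp() + "\t"
                  + Bytes.toStringBinary(kv.getValue()));
            }
          }
        } finally {
          out.close();
          scanner.close();
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");   // hypothetical table
        backupRange(table, "000000", "050000", "/backup_dir_A/part-0");
        backupRange(table, "050000", "100001", "/backup_dir_B/part-0");
        table.close();
      }
    }

Each chunk is an independent scan, so the chunks can also run in parallel from
different clients.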
On Wed, Aug 22, 2012 at 6:48 AM, Rita <[EMAIL PROTECTED]> wrote:

> what is the typical conversion process? My biggest worry is going from a
> higher version of HBase to a lower version, say CDH4 to CDH3u1.
>
>
>
> On Thu, Aug 16, 2012 at 7:53 AM, Paul Mackles <[EMAIL PROTECTED]> wrote:
>
>> Hi Rita
>>
>> By default, the export that ships with HBase writes KeyValue objects to a
>> sequence file. It is a very simple app, and it wouldn't be hard to roll
>> your own export program to write whatever format you want: use the current
>> export program as a basis and just change the output of the mapper. [A
>> sketch of such a mapper follows the quoted thread.]
>>
>> I will say that I spent a lot of time thinking about backups and DR, and I
>> didn't really worry much about HBase versions. The file formats for HBase
>> don't change that often, and when they do there is usually a pretty
>> straightforward conversion process. Also, if you are doing something like
>> full daily backups, I am having trouble imagining a scenario where you
>> would need to restore from anything but the most recent backup.
>>
>> Depending on which version of HBase you are using, there are probably much
>> bigger issues with using export for backups that you should worry about,
>> like being able to restore in a timely fashion, preserving deletes, and
>> the impact of the backup process on your SLA. [See the note on raw scans
>> after the quoted thread.]
>>
>> Paul
>>
>>
>> On 8/16/12 7:31 AM, "Rita" <[EMAIL PROTECTED]> wrote:
>>
>> >I am sure this topic has been visited many times, but I thought I'd ask
>> >to see if anything has changed.
>> >
>> >We are using HBase with close to 40b rows, and backing up the data is
>> >non-trivial. We can use Export to copy a table to another Hadoop/HDFS
>> >filesystem, but I am not aware of any guaranteed way of preserving data
>> >from one version of HBase to another (specifically if it's very old). Is
>> >there a program which will serialize the data into JSON/XML and dump it
>> >on a Unix filesystem? Once I get the data, we can compress it however we
>> >like and back it up using our internal software.
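Following up on Paul's suggestion, and on the JSON request in Rita's first
message: a minimal sketch of a swapped-out mapper against the 0.9x-era API.
The class name and the JSON-ish line layout are made up, and real JSON output
would also need to escape quotes (Bytes.toStringBinary escapes unprintable
bytes but not quotation marks):

    import java.io.IOException;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;

    // Emits one line of JSON-ish text per row, instead of the KeyValues in a
    // SequenceFile that the stock Export mapper writes.
    public class JsonExportMapper extends TableMapper<NullWritable, Text> {
      private final Text line = new Text();

      @Override
      protected void map(ImmutableBytesWritable row, Result result,
                         Context context)
          throws IOException, InterruptedException {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"row\":\"")
          .append(Bytes.toStringBinary(row.get())).append('"');
        for (KeyValue kv : result.raw()) {        // every cell in this row
          sb.append(",\"")
            .append(Bytes.toStringBinary(kv.getFamily())).append(':')
            .append(Bytes.toStringBinary(kv.getQualifier()))
            .append("\":\"")
            .append(Bytes.toStringBinary(kv.getValue())).append('"');
        }
        sb.append('}');
        line.set(sb.toString());
        context.write(NullWritable.get(), line);
      }
    }

Job setup then looks like Export's own, but with text output:
TableMapReduceUtil.initTableMapperJob(tableName, scan, JsonExportMapper.class,
NullWritable.class, Text.class, job), plus
job.setOutputFormatClass(TextOutputFormat.class).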
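And on Paul's point about preserving deletes: from HBase 0.94 on, the stock
Export can run a raw scan so that delete markers travel with the backup.
Something like the following should work, though the property is
version-dependent, so check the usage string your Export prints (the table
name and output path here are made up):

    hbase org.apache.hadoop.hbase.mapreduce.Export \
        -D hbase.mapreduce.include.deleted.rows=true mytable /backups/mytable

A raw scan still reads the whole table, so it does not help with the
restore-time or SLA concerns.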

--
--- Get your facts first, then you can distort them as you please.--