HBase user mailing list: backup strategies


Rita            2012-08-16, 11:31
Paul Mackles    2012-08-16, 11:53
Rita            2012-08-22, 10:48
Re: backup strategies
Let's say I have a huge table and I want to back it up onto a system with a
lot of disk space. Would this work: take all the keys and export the table in
chunks by selectively picking key ranges? For instance, if the keys run from
0-100000, I would back up keys 0-50000 into backup_dir_A and keys
50001-100000 into backup_dir_B. Would that be feasible?
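Something like the sketch below would cover each chunk with the plain client API. It is only a rough illustration: the table name "mytable", the output file paths, and the zero-padded string keys are placeholder assumptions, and the stop row of a Scan is exclusive, so numeric keys need zero-padding for the byte order to match numeric order.

    import java.io.FileWriter;
    import java.io.PrintWriter;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RangeBackup {

      // Scan [startKey, stopKey) and dump every cell as one tab-separated line.
      static void exportRange(HTable table, String startKey, String stopKey,
                              String outFile) throws Exception {
        Scan scan = new Scan(Bytes.toBytes(startKey), Bytes.toBytes(stopKey));
        scan.setCaching(1000);                      // fewer RPCs for a bulk read
        ResultScanner scanner = table.getScanner(scan);
        PrintWriter out = new PrintWriter(new FileWriter(outFile));
        try {
          for (Result r : scanner) {
            for (KeyValue kv : r.raw()) {
              out.println(Bytes.toStringBinary(kv.getRow()) + "\t"
                  + Bytes.toString(kv.getFamily()) + ":"
                  + Bytes.toStringBinary(kv.getQualifier()) + "\t"
                  + kv.getTimestamp() + "\t"
                  + Bytes.toStringBinary(kv.getValue()));
            }
          }
        } finally {
          out.close();
          scanner.close();
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");   // placeholder table name
        // HBase compares row keys as raw bytes, so the numeric keys are
        // zero-padded here; the stop row is exclusive.
        exportRange(table, "000000", "050000", "backup_dir_A/part-0");
        exportRange(table, "050000", "100001", "backup_dir_B/part-0");
        table.close();
      }
    }

A single-process scan like this will be much slower than a MapReduce export for 40B rows, so in practice each range would probably become its own Export job rather than a client-side loop.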

On Wed, Aug 22, 2012 at 6:48 AM, Rita <[EMAIL PROTECTED]> wrote:

> What is the typical conversion process? My biggest worry is going from a
> higher version of HBase to a lower version, say CDH4 to CDH3U1.
>
>
>
> On Thu, Aug 16, 2012 at 7:53 AM, Paul Mackles <[EMAIL PROTECTED]> wrote:
>
>> Hi Rita
>>
>> By default, the export that ships with HBase writes KeyValue objects to a
>> sequence file. It's a very simple app, so it wouldn't be hard to roll your
>> own export program to write whatever format you want. You can use the
>> current export program as a basis and just change the output of the mapper.
>>
>> I will say that I spent a lot of time thinking about backups and DR, and I
>> didn't really worry much about HBase versions. The file formats for HBase
>> don't change that often, and when they do, there is usually a pretty
>> straightforward conversion process. Also, if you are doing something like
>> full daily backups, then I am having trouble imagining a scenario where you
>> would need to restore from anything but the most recent backup.
>>
>> Depending on which version of HBase you are using, there are probably much
>> bigger issues with using export for backups that you should worry about,
>> like being able to restore in a timely fashion, preserving deletes, and the
>> impact of the backup process on your SLA.
>>
>> Paul
>>
>>
>> On 8/16/12 7:31 AM, "Rita" <[EMAIL PROTECTED]> wrote:
>>
>> >I am sure this topic has been visited many times, but I thought I'd ask to
>> >see if anything has changed.
>> >
>> >We are using HBase with close to 40B rows, and backing up the data is
>> >non-trivial. We can export the table to another Hadoop/HDFS filesystem, but
>> >I am not aware of any guaranteed way of preserving data from one version of
>> >HBase to another (specifically if it's very old). Is there a program which
>> >will serialize the data into JSON/XML and dump it on a Unix filesystem?
>> >Once I get the data we can compress it however we like and back it up
>> >using our internal software.
>> >
>> >
>> >
>> >
>> >--
>> >--- Get your facts first, then you can distort them as you please.--
>>
>>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>

--
--- Get your facts first, then you can distort them as you please.--
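For reference, the "change the output of the mapper" approach Paul describes could look roughly like the sketch below: a TableMapper that writes each cell as one line of JSON-like text through TextOutputFormat instead of KeyValues in a SequenceFile. This is only a sketch, not the stock Export tool: the table name and output path are placeholders, and the string concatenation is not real JSON escaping.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class JsonExport {

      // Emits one JSON-like line per cell instead of KeyValues in a SequenceFile.
      static class JsonMapper extends TableMapper<NullWritable, Text> {
        private final Text line = new Text();

        @Override
        protected void map(ImmutableBytesWritable row, Result result, Context ctx)
            throws IOException, InterruptedException {
          for (KeyValue kv : result.raw()) {
            // toStringBinary is not proper JSON escaping; good enough for a sketch.
            line.set("{\"row\":\"" + Bytes.toStringBinary(kv.getRow())
                + "\",\"family\":\"" + Bytes.toString(kv.getFamily())
                + "\",\"qualifier\":\"" + Bytes.toStringBinary(kv.getQualifier())
                + "\",\"ts\":" + kv.getTimestamp()
                + ",\"value\":\"" + Bytes.toStringBinary(kv.getValue()) + "\"}");
            ctx.write(NullWritable.get(), line);
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "json-export");
        job.setJarByClass(JsonExport.class);

        Scan scan = new Scan();                      // full-table scan; restrict to key ranges as needed
        scan.setCaching(500);
        scan.setCacheBlocks(false);                  // don't pollute the block cache during the backup

        TableMapReduceUtil.initTableMapperJob("mytable", scan, JsonMapper.class,
            NullWritable.class, Text.class, job);
        job.setNumReduceTasks(0);                    // map-only, like the stock Export
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path("/backups/mytable-json"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

The text output can then be pulled off HDFS, compressed, and handed to whatever internal backup software is in use, at the cost of a larger on-disk footprint than the KeyValue SequenceFile format.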