Accumulo >> mail # user >> Backup Strategies


Re: Backup Strategies
On Fri, May 31, 2013 at 2:39 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:

> I'm not sure copying data out of HDFS is what you would want to do, though
> I suppose it depends on how much data you're storing there.  If you want a
> backup on a different system, but you have too much data to store outside
> of a distributed file system, you could consider using distcp to copy from
> one HDFS instance to another.
>
> You can't clone the !METADATA table.  In 1.5.0, you can export and import
> tables, which is designed to help you copy a table to a different cluster
> (see docs/examples/README.export).  Cloning your tables could help, but in
> the case of !METADATA corruption you're still in the position of manually
> creating a new table with the same configuration (and split points if you
> know them) and bulk importing the old data files.  I don't know if table
> export could be used to back up the metadata and configuration of a cloned
> table to help you recover its state later on the same system if the
> original table has gotten corrupted.  It's possible.
>

Export table will save the table's state (what's in !METADATA and ZooKeeper)
to a zip file.  So even if you do not actually copy the exported table, it
can be used to save table metadata.  I made a comment on ACCUMULO-942 about
using export table to obtain a consistent snapshot of HDFS and Accumulo
metadata.  That system metadata could be backed up.
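
As a rough sketch of what a scripted version of this might look like, the
Python helper below only builds the command strings for one table: clone it,
take the clone offline, export the clone, then copy the files listed in the
export's distcp.txt.  The shell commands (clonetable, offline, exporttable -t)
and the distcp.txt file are the ones shown in docs/examples/README.export.  The
function name, the "_bak_<timestamp>" clone naming, and the "_files"
destination directory are made up for illustration, and nothing here talks to
a cluster.

```python
from datetime import datetime, timezone


def export_backup_plan(table, export_root, when=None):
    """Build the commands for an export-based backup of one table.

    Returns (accumulo_shell_commands, distcp_command) as strings only.
    Naming scheme and destination path are hypothetical choices.
    """
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%d%H%M")
    clone = f"{table}_bak_{stamp}"  # assumed naming convention
    export_dir = f"{export_root.rstrip('/')}/{clone}"

    shell_commands = [
        # Clone so the live table can stay online.
        f"clonetable {table} {clone}",
        # exporttable works on an offline table, so offline the clone.
        f"offline {clone}",
        # Writes exportMetadata.zip and distcp.txt into export_dir.
        f"exporttable -t {clone} {export_dir}",
    ]
    # distcp.txt lists the files the export refers to; copy them somewhere
    # safe (the "_files" suffix is an assumption).
    distcp = f"hadoop distcp -f {export_dir}/distcp.txt {export_dir}_files"
    return shell_commands, distcp


if __name__ == "__main__":
    commands, distcp = export_backup_plan("mytable", "/backups")
    for line in commands:
        print(line)  # paste into the Accumulo shell
    print(distcp)    # run from a regular shell afterwards
```

If only the table metadata is wanted, the distcp step can be skipped and just
the exportMetadata.zip in the export directory kept.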

>
>
> Billie
>
>
> On Fri, May 31, 2013 at 11:05 AM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>
>> I'm curious to know how people are backing up data in Accumulo.
>>
>> We are planning on copying data out of HDFS on a regular basis to be
>> able to do a full restore.
>>
>> We've also ended up getting into a state of having a corrupt !METADATA
>> table a few times.  I'm wondering if doing a clone on a few tables on a
>> periodic basis (like every hour, for a few hours) might be one way to help
>> us recover from that situation.
>>
>> E.g., if we did a clone on all tables, including the !METADATA table
>> hourly, and we didn't necessarily care about losing data in the last hour
>> time frame, could we simply restore from one of those clones if we get into
>> a corrupted state?
>>
>> Is there another mechanism for snapshotting / backing up data in Accumulo?
>>
>> Thanks for your thoughts!
>>
>> Mike
>>
>
>