

Re: Backup Strategies
On Fri, May 31, 2013 at 2:39 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:

> I'm not sure copying data out of HDFS is what you would want to do, though
> I suppose it depends on how much data you're storing there.  If you want a
> backup on a different system, but you have too much data to store outside
> of a distributed file system, you could consider using distcp to copy from
> one HDFS instance to another.
>
> You can't clone the !METADATA table.  In 1.5.0, you can export and import
> tables, which is designed to help you copy a table to a different cluster
> (see docs/examples/README.export).  Cloning your tables could help, but in
> the case of !METADATA corruption you're still in the position of manually
> creating a new table with the same configuration (and split points if you
> know them) and bulk importing the old data files.  I don't know if table
> export could be used to back up the metadata and configuration of a cloned
> table to help you recover its state later on the same system if the
> original table has gotten corrupted.  It's possible.
>

Export table will save the table's state (what's in the !METADATA table and in
ZooKeeper) to a zip file.  So even if you do not actually copy the exported
table's files, the export can be used to save table metadata.  I made a comment
on ACCUMULO-942 about using export table to obtain a consistent snapshot of
HDFS and Accumulo metadata.  That system metadata could then be backed up.
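
For reference, here is a rough sketch of that kind of clone-then-export backup
using the 1.5 Java API.  The instance name, ZooKeeper hosts, credentials, table
name, and export directory are all placeholders, and this is an untested
outline rather than a recommended procedure:

    import java.util.Collections;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.admin.TableOperations;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;

    public class ExportTableBackup {
      public static void main(String[] args) throws Exception {
        // Placeholder instance name, ZooKeepers, and credentials.
        Connector conn = new ZooKeeperInstance("myinstance", "zkhost:2181")
            .getConnector("root", new PasswordToken("secret"));
        TableOperations ops = conn.tableOperations();

        // Clone the live table (flushing it first) so the original stays
        // online while we export.
        ops.clone("mytable", "mytable_backup", true,
            Collections.<String, String>emptyMap(),
            Collections.<String>emptySet());

        // Export requires the table to be offline so its set of files is stable.
        ops.offline("mytable_backup");

        // Writes the table configuration and metadata (exportMetadata.zip plus
        // a file list suitable for distcp) into the given HDFS directory.
        ops.exportTable("mytable_backup", "/backups/mytable_backup");
      }
    }

In the docs/examples/README.export example, the generated file list is then fed
to distcp to copy the referenced files, and importtable recreates the table on
the destination cluster.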

>
>
> Billie
>
>
> On Fri, May 31, 2013 at 11:05 AM, Mike Hugo <[EMAIL PROTECTED]> wrote:
>
>> I'm curious to know how people are backing up data in Accumulo.
>>
>> We are planning on copying data out of HDFS on some regular basis to be
>> able to do a full restore.
>>
>> We've also ended up with a corrupt !METADATA table a few times.  I'm
>> wondering if doing a clone on a few tables on a periodic basis (like every
>> hour, for a few hours) might be one way to help us recover from that
>> situation.
>>
>> E.g. if we did a clone on all tables, including the !METADATA table,
>> hourly, and we didn't necessarily care about losing data from the last
>> hour, could we simply restore from one of those clones if we get into a
>> corrupted state?
>>
>> Is there another mechanism for snapshotting / backing up data in Accumulo?
>>
>> Thanks for your thoughts!
>>
>> Mike
>>
>
>