HBase, mail # user - HBase Snapshots as backup solution


Samir Ahmic 2013-05-09, 08:11
Re: HBase Snapshots as backup solution
Matteo Bertozzi 2013-05-09, 08:55
Hi,

Could you describe in a bit more detail what you need from a backup, what
you expect from it, and what your data flow is?

At the moment, a snapshot guarantees only row-level consistency.
This means that if you have in-flight writes, some may be present in the
snapshot and some may not.
So the best-suited workload is one that imports data from somewhere else.
In that case, after a restore you can check which keys are present and
reimport the ones that are missing. The other case is the one where you
don't care if some rows are missing.
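
To make the "check which keys are present and reimport the missing ones"
step concrete, here is a minimal sketch in Python. It assumes you can list
the row keys from your upstream source and from the restored table; the
function name and the sample keys are made up for illustration, and how you
actually read the keys (a scan, an export file, ...) is up to you.

```python
def keys_to_reimport(source_keys, restored_keys):
    """Return the source keys that are absent from the restored table, sorted."""
    return sorted(set(source_keys) - set(restored_keys))


# Hypothetical example data: keys known to the upstream source vs. keys
# found in the table after restoring the snapshot.
source = ["row-001", "row-002", "row-003", "row-004"]
restored = ["row-001", "row-003"]

missing = keys_to_reimport(source, restored)
print(missing)  # ['row-002', 'row-004']
```

For very large tables you would probably compare sorted key streams instead
of building in-memory sets, but the idea is the same.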

Snapshots don't create a copy of the data, so if you want to properly
save your data somewhere else you have to use ExportSnapshot to copy the
data to another cluster. The main difference from CopyTable is that you
don't impact the region servers (RS) during the export, since the operation
works purely at the filesystem level.
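
As a rough illustration of the ExportSnapshot step, the sketch below only
builds and prints the command line; it does not run anything. The snapshot
name, destination URL and mapper count are placeholders, and the helper
function is hypothetical. The -snapshot, -copy-to and -mappers options are
the ones the tool documents, but please check them against your HBase
version before relying on this.

```python
import shlex


def export_snapshot_command(snapshot, copy_to, mappers=None):
    """Build the argv list for an ExportSnapshot run (nothing is executed)."""
    cmd = [
        "hbase",
        "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,   # name of an existing snapshot
        "-copy-to", copy_to,     # HBase root directory on the target cluster
    ]
    if mappers is not None:
        cmd += ["-mappers", str(mappers)]  # number of parallel copy tasks
    return cmd


# Placeholder values, not taken from any real cluster.
cmd = export_snapshot_command("MySnapshot", "hdfs://srv2:8082/hbase", mappers=16)
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a terminal on a node of the source
cluster, or the list can be handed to subprocess.run if you want to drive
the export from a script.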

I think snapshots are really good for testing: you can take a snapshot of
a table, clone a new table from the snapshot, and try changing compression
or schema, or just play with the data, without impacting the main table
and without having to copy petabytes of data.
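
The snapshot-then-clone workflow for testing can be sketched as a small
Python helper that writes out the two HBase shell commands involved
(snapshot and clone_snapshot). The table and snapshot names are
placeholders and the helper itself is hypothetical; it only produces text,
which you would then run through the HBase shell yourself.

```python
def clone_for_testing_script(table, snapshot, clone):
    """Return HBase shell commands that snapshot a table and clone it."""
    return "\n".join([
        f"snapshot '{table}', '{snapshot}'",        # take the snapshot
        f"clone_snapshot '{snapshot}', '{clone}'",  # new table backed by it
    ])


# Placeholder names for illustration only.
script = clone_for_testing_script("myTable", "myTable-snap", "myTable-test")
print(script)
```

Any schema or compression experiments would then be done on the cloned
table, leaving the original untouched.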

Here is a snapshot-related blog post that explains how the feature works
and describes some of the use cases:
https://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/

Matteo

On Thu, May 9, 2013 at 9:11 AM, Samir Ahmic <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> We are using hbase-0.94.6.1 and at the moment I'm evaluating Snapshots as a
> backup solution for moving data between clusters. I'm wondering if someone
> has similar experience, and what the pros and cons are. Also, is the Snapshot
> feature stable enough for this sort of operation?
>
> Thanks,
> Samir
>