We're running HBase replication successfully on a 500 TB cluster (compressed; raw
is about 2 PB) over a 60 ms link across the country. I'd give it a
thumbs up for surviving the loss of a cluster and for running
applications in two places, as long as they can tolerate the
inconsistency that comes with asynchronous replication. ( http://hbase.apache.org/replication.html )
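For reference, a minimal sketch of what enabling replication looks like in the 0.94-era setup; the peer ID, ZooKeeper quorum, table, and column family names below are all illustrative, not from our deployment:

```shell
# Prerequisite (both clusters, hbase-site.xml): hbase.replication = true

# In the hbase shell on the source cluster, add the remote cluster as a peer.
# The peer key is <zk quorum>:<zk port>:<zk parent znode>.
add_peer '1', 'zk1.dr.example.com,zk2.dr.example.com,zk3.dr.example.com:2181:/hbase'

# Mark the column family for replication (scope 1 = replicate).
disable 'mytable'
alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => '1'}
enable 'mytable'
```

Edits to that column family are then shipped asynchronously from the source region servers' WALs to the peer, which is why the two sides can briefly disagree.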
You'll still want some sort of snapshot / export to be able to recover from
bugs or corruption that gets replicated. We're intending to try out HBase
snapshots ( http://hbase.apache.org/book/ops.snapshots.html ) once we've
deployed a release with support.
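As a rough sketch of both recovery paths, assuming a snapshot-capable release (0.94.6+ with hbase.snapshot.enabled set) and illustrative table/path names:

```shell
# hbase shell: take a point-in-time snapshot, which can later be
# restored if replicated corruption needs to be rolled back.
snapshot 'mytable', 'mytable_snap_20131017'
# restore_snapshot 'mytable_snap_20131017'

# Alternative: MapReduce Export of only recent edits, using the
# optional version count and start/end timestamps (epoch millis):
#   Export <table> <outputdir> [<versions> [<starttime> [<endtime>]]]
hbase org.apache.hadoop.hbase.mapreduce.Export \
  mytable /backups/mytable-incr 1 1381968000000 1382054400000
```

The time-ranged Export is one way to get the "incremental" behavior mentioned below without re-shipping the whole table each time.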
I'd also recommend using a recent 0.94 release if possible.
On Thu, Oct 17, 2013 at 12:52 PM, hdev ml <[EMAIL PROTECTED]> wrote:
> Hello all,
> We are looking at a solution for HBase backup, recovery, and replication.
> We did take a look at HBase replication, but we are not sure whether it
> is being used at large.
> Our data size in HBase is around 4TB.
> We were thinking of the traditional DB approach of exporting a full dump weekly and
> then doing incremental exports at regular intervals, say 2-3 times a day.
> But we soon realized that transferring 4 TB to our DR site, with our
> current bandwidth, would take 100+ hours.
> Are there better solutions out there? What do large installations do?
> Any documentation?
> Please let me know