Re: HBase Replication Progress
Ted Yu 2013-11-12, 03:24
bq. consider moving to 0.94.6+
Moving to 0.94.7 or a newer release is recommended.
On Mon, Nov 11, 2013 at 4:18 PM, Demai Ni <[EMAIL PROTECTED]> wrote:
> From your requirement, I think the 'snapshot' feature with export will work
> better. Here is some info:
> To fully benefit from this feature, you may want to consider moving to 0.94.6+.
> I am still curious about this hard requirement: ".. The second map reduce
> job cannot start until all the data from Cluster A has been replicated to
> Cluster B....". Consider that the output of the first mapreduce job will be
> put into an HBase table on Cluster A. There is no need to wait until
> replication completes, as long as you use different row IDs so that the 2nd
> output won't overwrite the 1st one. HBase replication will handle the
> situation very well.
> On Mon, Nov 11, 2013 at 4:03 PM, Kevin Su <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I am having trouble searching for answers regarding HBase replication, so I
> > thought I would email the mailing list.
> > Does HBase provide an API/way to see what has/hasn't been replicated yet?
> > My use case is the following:
> > I run a map reduce job in Cluster A and stick the output in HBase. I would
> > like to transport this output to Cluster B as (part of) the input to
> > another map reduce job. I hope to achieve this transport via HBase
> > replication. The second map reduce job cannot start until all the data from
> > Cluster A has been replicated to Cluster B. So what is the best way to
> > check if everything has been replicated? Do I query ZooKeeper and check
> > that the RS queues are empty? Or is HBase replication not the right fit for my
> > use case?
> > I am using HBase 0.94.2.
> > Thanks in advance for any advice!
> > --
> > Kevin
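
Demai's suggestion above, giving each job's output different row IDs so the second output cannot overwrite the first, could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name JobRowKeys, the '|' separator, and the "job-0001" style job IDs are hypothetical and do not come from the thread, and the sketch uses plain Java with no HBase client calls. In real code the resulting bytes would be used as the row key of each Put.

```java
import java.nio.charset.StandardCharsets;

public class JobRowKeys {
    // Hypothetical separator between the job ID and the record key.
    private static final char SEP = '|';

    /**
     * Builds a row key of the form <jobId>|<recordKey>, so that two jobs
     * writing the same record key end up in different rows.
     */
    public static byte[] rowKey(String jobId, String recordKey) {
        if (jobId.indexOf(SEP) >= 0) {
            // Keep the prefix unambiguous: the job ID must not contain the separator.
            throw new IllegalArgumentException("jobId must not contain '" + SEP + "'");
        }
        return (jobId + SEP + recordKey).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same record key, two different (hypothetical) job runs.
        byte[] first = rowKey("job-0001", "user42");
        byte[] second = rowKey("job-0002", "user42");
        System.out.println(new String(first, StandardCharsets.UTF_8));
        System.out.println(new String(second, StandardCharsets.UTF_8));
    }
}
```

Prefixing with the job ID also keeps each job's rows contiguous in the table, so a reader on Cluster B could scan one job's output by prefix. Whether that layout suits the second map reduce job is a design choice the thread does not settle.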