HBase user mailing list: HBase Cyclic Replication Issue: some data are missing in the replication for intensive write


Re: HBase Cyclic Replication Issue: some data are missing in the replication for intensive write
Hi Himanshu:

Thanks for following up! I did look through the logs, and there were some exceptions. I'm not sure whether those exceptions contributed to the problem I saw a week ago.
I am aware of the latency between the time the master says "Nothing to replicate" and the time the data actually lands on the slave. I remember waiting 12 hours for the replication to finish (i.e. starting the test before leaving the office and checking the result the next day), and the data was still not fully replicated.

By the way, is your test running with master-slave replication or master-master replication?
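For context, master-master (cyclic) replication is just master-slave replication configured in both directions. A minimal sketch using the 0.92-era hbase shell syntax current when this thread was written; the peer id '1', the table/family names, and the ZooKeeper quorum hosts are all placeholders, not values from this thread:

```shell
# Sketch only: cyclic replication = two symmetric master-slave peers.
# All hostnames, peer ids, and table names below are placeholders.

# On cluster A's hbase shell, point a peer at cluster B's ZK ensemble
# and mark the column family for replication:
#   add_peer '1', "zk-b1,zk-b2,zk-b3:2181:/hbase"
#   disable 'usertable'
#   alter 'usertable', {NAME => 'f1', REPLICATION_SCOPE => 1}
#   enable 'usertable'

# On cluster B's hbase shell, add the symmetric peer back to cluster A:
#   add_peer '1', "zk-a1,zk-a2,zk-a3:2181:/hbase"
#   disable 'usertable'
#   alter 'usertable', {NAME => 'f1', REPLICATION_SCOPE => 1}
#   enable 'usertable'
```

These commands require a running cluster on each side, so the block is a configuration sketch rather than something runnable standalone.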

I will resume this again. I was busy on something else for the past week or so.

Best Regards,

Jerry

On 2012-05-01, at 6:41 PM, Himanshu Vashishtha wrote:

> Hello Jerry,
>
> Did you try this again?
>
> Whenever you try next, can you please share the logs somehow.
>
> I tried replicating your scenario today, but no luck. I used the same
> workload you have copied here; master cluster has 5 nodes and slave
> has just 2 nodes; and made tiny regions of 8MB (memstore flushing at
> 8mb too), so that I have around 1200+ regions even for 200k rows; ran
> the workload with 16, 24 and 32 client threads, but the verifyrep
> mapreduce job says it's good.
> Yes, I ran the verifyrep command after seeing "there is nothing to
> replicate" message on all the regionservers; sometimes it was a bit
> slow.
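For reference, the verifyrep job mentioned above is HBase's bundled VerifyReplication MapReduce job, run on the master cluster. A sketch of invoking it; the peer id '1', the table name 'usertable', and the epoch-millisecond timestamps are placeholders, not values from this thread:

```shell
# Sketch: run the bundled VerifyReplication job against peer '1' for
# table 'usertable', limited to an example time window (epoch millis).
# Requires a live master cluster with replication configured.
hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication \
    --starttime=1335830400000 --stoptime=1335916800000 \
    1 usertable
# Check the job counters in the output: GOODROWS counts rows that match
# on both clusters, BADROWS counts rows that differ or are missing.
```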
>
>
> Thanks,
> Himanshu
>
> On Mon, Apr 23, 2012 at 11:57 AM, Jean-Daniel Cryans
> <[EMAIL PROTECTED]> wrote:
>>> I will try your suggestion today with master-slave replication enabled from Cluster A -> Cluster B.
>>
>> Please do.
>>
>>> Last Friday, I tried to limit the variability/the moving parts of the replication setup. I reduced Cluster B to a single regionserver and had Cluster A replicate data from one region only, with region splitting disabled (so I had a 1-to-1 region replication setup). During the benchmark, I moved the region between different regionservers in Cluster A (note there are still 3 regionservers in Cluster A). I ran this test five times and no data was lost. Does that mean something? My feeling is there are some glitches/corner cases that have not been covered in cyclic replication (or HBase replication in general). Note that this happens only when the load is high.
>>
>> And have you looked at the logs? Any obvious exceptions coming up?
>> Replication uses the normal HBase client to insert the data on the
>> other cluster and this is what handles regions moving around.
>>
>>>
>>> By the way, why do we need to have a zookeeper not handled by hbase for the replication to work (it is described in the hbase documentation)?
>>
>> It says you *should* do it, not you *need* to do it :)
>>
>> But basically replication is zk-heavy and getting a better
>> understanding of it starts with handling it yourself.
>>
>> J-D
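On the externally managed ZooKeeper point J-D raises above: a minimal configuration sketch of what "a zookeeper not handled by hbase" looks like in practice. The quorum hostnames are placeholders:

```shell
# Sketch only: running HBase against a ZooKeeper ensemble that HBase
# does not start or stop itself. Hostnames are placeholders.

# In conf/hbase-env.sh, tell HBase not to manage ZooKeeper:
export HBASE_MANAGES_ZK=false

# In conf/hbase-site.xml, point HBase at the external ensemble:
#   <property>
#     <name>hbase.zookeeper.quorum</name>
#     <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
#   </property>
#   <property>
#     <name>hbase.zookeeper.property.clientPort</name>
#     <value>2181</value>
#   </property>
```

This is a config fragment, not a runnable script; the point is simply that with HBASE_MANAGES_ZK=false you can inspect and operate the ensemble replication depends on directly.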