HBase >> mail # dev >> Re: HBase replication: "in order semantics"


Himanshu Vashishtha 2012-11-09, 18:01
Jan Van Besien 2012-11-12, 09:15
Jan Van Besien 2012-11-15, 07:54
Re: HBase replication: "in order semantics"
Yes. If there are failures in the source cluster, the replicated data might be delivered out of order.
Note that you will never see partial rows applied; replication will enforce HBase's ACID guarantee here.
However, rows might be delivered out of order, which means the ordering of deletes and puts might change, and hence lead to *temporary* visibility of deleted data. Eventually the state will be correct, though.
-- Lars
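
One way to picture the temporary state described above is a small simulation. This is only a sketch under simplifying assumptions: a row is a list of timestamped edits, a delete marker masks every version of its column with an equal or older timestamp, and the delivery order on the replica is hypothetical. None of the names below are HBase APIs.

```python
from typing import Dict, List, Optional, Tuple

# An edit is (column, timestamp, value); value None stands for a delete marker
# that masks every version of that column with timestamp <= its own.
# This is a simplified model, not HBase's actual data structures.
Edit = Tuple[str, int, Optional[str]]


def read_row(applied: List[Edit]) -> Dict[str, str]:
    """Return the latest visible value per column for the edits applied so far."""
    row: Dict[str, str] = {}
    for col in {c for c, _, _ in applied}:
        puts = [(ts, v) for c, ts, v in applied if c == col and v is not None]
        deletes = [ts for c, ts, v in applied if c == col and v is None]
        newest_delete = max(deletes, default=-1)
        visible = [(ts, v) for ts, v in puts if ts > newest_delete]
        if visible:
            row[col] = max(visible)[1]
    return row


# Order on the source cluster: put column a, delete column a, put column b.
source_order: List[Edit] = [("a", 1, "x"), ("a", 2, None), ("b", 3, "y")]
# Hypothetical delivery order on the replica: the delete arrives last.
replica_order: List[Edit] = [source_order[0], source_order[2], source_order[1]]

# Row state after each applied edit, on the source and on the replica.
source_states = [read_row(source_order[:i]) for i in range(1, 4)]
replica_states = [read_row(replica_order[:i]) for i in range(1, 4)]
```

In this model the replica passes through a state in which both columns are visible, which the source never showed, yet both clusters end in the same final state.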

________________________________
 From: Jan Van Besien <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wednesday, November 14, 2012 11:54 PM
Subject: Re: HBase replication: "in order semantics"
 
Hi all,

On 11/12/2012 10:15 AM, Jan Van Besien wrote:
> It does however still mean that it is possible to see rows on the
> replica in a state that never occurred on the original HBase cluster, in
> case put (or even delete) operations are not replicated in original
> order and a client is reading the "latest" state.

Can anybody confirm or deny my statement above? I would like to know whether my understanding of the replication feature is correct. Thanks.

Additionally, I think there might be another problem with the correctness guarantees of replication when edits arrive on the replica "out of order" (which, AFAIK, can happen when a region server moves, so that for a while new edits for the same region are potentially replicated in parallel with older entries of that region).

Say that a put was followed by a delete, but the edits get reordered on the replica, thus first issuing the delete, then the put.

Normally, it should be ok. The delete marker will be written on the replica and then the put, but the versions of the key/values in the put are older, thus the delete marker will still "win" when reading back the data.
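
A minimal sketch of why the arrival order does not matter in this case, assuming a simplified model in which each version carries its original timestamp and a delete marker masks all puts with an equal or older timestamp. The names and timestamps are illustrative only and are not HBase APIs.

```python
from typing import List, Optional, Tuple

# A version is (timestamp, value); value None stands for a delete marker.
Version = Tuple[int, Optional[str]]


def read_latest(versions: List[Version]) -> Optional[str]:
    """Return the newest put not masked by a delete marker, or None."""
    newest_delete = max((ts for ts, v in versions if v is None), default=-1)
    live = [(ts, v) for ts, v in versions if v is not None and ts > newest_delete]
    return max(live)[1] if live else None


put: Version = (100, "v1")     # written first on the source
delete: Version = (200, None)  # written later on the source

in_order = read_latest([put, delete])
reordered = read_latest([delete, put])  # replica applies the delete first
```

Because the read compares timestamps rather than arrival order, both `in_order` and `reordered` come out as `None` here: the older put stays masked.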

However, what if a major compaction happens between the delete and the put? This major compaction cleans up the delete marker, so in that case the put will "win". Is this a situation in which the replica will be incorrect?
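
The interleaving asked about here can be sketched with the same kind of toy model. The assumptions are that a major compaction drops delete markers together with the puts they mask, and that the replica applies the edits in the hypothetical order shown. Whether HBase replication actually permits this interleaving is exactly the open question; the code only illustrates the mechanism, and its names are not HBase APIs.

```python
from typing import List, Optional, Tuple

# A version is (timestamp, value); value None stands for a delete marker.
Version = Tuple[int, Optional[str]]


def newest_delete_ts(versions: List[Version]) -> int:
    """Timestamp of the newest delete marker, or -1 if there is none."""
    return max((ts for ts, v in versions if v is None), default=-1)


def read_latest(versions: List[Version]) -> Optional[str]:
    """Return the newest put not masked by a delete marker, or None."""
    cutoff = newest_delete_ts(versions)
    live = [(ts, v) for ts, v in versions if v is not None and ts > cutoff]
    return max(live)[1] if live else None


def major_compact(versions: List[Version]) -> List[Version]:
    """Simplified compaction: drop delete markers and the puts they mask."""
    cutoff = newest_delete_ts(versions)
    return [(ts, v) for ts, v in versions if v is not None and ts > cutoff]


replica: List[Version] = []
replica.append((200, None))       # the delete (newer timestamp) arrives first
replica = major_compact(replica)  # compaction purges the marker
replica.append((100, "v1"))       # the older put arrives afterwards
resurrected = read_latest(replica)
```

In this model nothing is left to mask the late put once the marker is gone, so `resurrected` holds the value that was deleted on the source.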

Does this make sense? Or am I missing something completely?

Thanks in advance,
Jan
Jan Van Besien 2012-11-16, 11:18
lars hofhansl 2012-11-16, 17:24
Jan Van Besien 2012-11-09, 13:25