HBase >> mail # user >> I am curious how to find hbase replication lag


Alex Newman 2013-08-19, 19:42
gordoslocos 2013-08-19, 19:50
Ted Yu 2013-08-19, 20:04
Demai Ni 2013-08-19, 20:28
Vladimir Rodionov 2013-08-19, 21:00

Re: I am curious how to find hbase replication lag
Vladimir,

I have heard that your approach, a kind of column/timestamp marker, is
implemented by some companies. It is certainly a valid approach, and I am
also looking in this direction.
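
For reference, the marker idea can be sketched roughly as below. This is only a minimal illustration, assuming an in-memory `FakeTable` in place of a real HBase client; `FakeTable`, `write_canary`, `read_lag_ms` and `replicate` are hypothetical names, not HBase APIs.

```python
class FakeTable:
    """In-memory stand-in for an HBase table (hypothetical, not a real client)."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, value):
        self.rows[row_key] = value

    def get(self, row_key):
        return self.rows.get(row_key)


CANARY_ROW = "canary_row"


def write_canary(source_table, now_ms):
    """Store the current time in the marker row on the source cluster."""
    source_table.put(CANARY_ROW, now_ms)


def read_lag_ms(sink_table, now_ms):
    """Return now minus the last replicated marker, or None if none has arrived."""
    marker = sink_table.get(CANARY_ROW)
    if marker is None:
        return None
    return now_ms - marker


def replicate(source_table, sink_table):
    """Simulate replication by copying the marker row to the sink."""
    sink_table.put(CANARY_ROW, source_table.get(CANARY_ROW))


if __name__ == "__main__":
    src, dst = FakeTable(), FakeTable()
    write_canary(src, now_ms=1_000)
    replicate(src, dst)               # the marker written at t=1000 reaches the sink
    write_canary(src, now_ms=2_000)   # this one has not been replicated yet
    print(read_lag_ms(dst, now_ms=4_500))
```

In a real deployment the writer and the reader run on different machines, so any clock skew between them is included in the computed difference.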

I'd just like to add a couple of comments on the approach, which I am
seeking to improve:

1) The column to hold the marker (i.e. the canary_row) has to be created
manually.
2) A thread (a co-processor?) is needed to write the marker every second.
3) I assume the marker is written regardless of the real data operations.
That is 60 rows per minute per table; if a user has 100 tables being
replicated, it is 8.64 million rows (60*60*24*100) a day. Not a big problem,
but still a minor impact.
4) It is better to have a script that removes the markers periodically, as
the information loses its value very soon (we are not going to compare
against yesterday's markers).
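
To make the numbers in 3) and the cleanup in 4) concrete, here is a small sketch. The function names and the retention window are assumptions for illustration only; the markers are modelled as a plain list of timestamps, not as HBase cells.

```python
SECONDS_PER_DAY = 60 * 60 * 24


def marker_rows_per_day(tables, writes_per_second=1):
    """Marker writes per day across all replicated tables."""
    return writes_per_second * SECONDS_PER_DAY * tables


def prune_markers(marker_timestamps_ms, now_ms, keep_ms):
    """Keep only markers written within the last keep_ms milliseconds."""
    cutoff = now_ms - keep_ms
    return [ts for ts in marker_timestamps_ms if ts >= cutoff]


if __name__ == "__main__":
    print(marker_rows_per_day(tables=100))
    print(prune_markers([1_000, 5_000, 9_000], now_ms=10_000, keep_ms=5_000))
```

If the marker always reuses the same row key, as in Vladimir's description, the daily count is a number of writes to one row rather than a number of distinct rows.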

About 3): if the marker is written only together with real data
operations, how do we address this scenario?
step1:  put row1 into t1 at 8:00am
step2:  no edits on t1 at all for an hour
step3:  put row2 into t1 at 9:00am
step4:  during step2, before step3, check the lag (for example at 8:59am).
Is it a 59-minute delay or actually no delay at all?
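
The ambiguity in this scenario can be shown with a few lines of arithmetic. This is a sketch only: `apparent_lag_minutes` is a hypothetical helper, and the heartbeat case assumes replication is fully caught up at 8:59am.

```python
from datetime import datetime


def apparent_lag_minutes(last_marker_at_sink, check_time):
    """Lag as a monitor would report it: check time minus newest marker seen."""
    return (check_time - last_marker_at_sink).total_seconds() / 60


if __name__ == "__main__":
    check = datetime(2013, 8, 19, 8, 59)

    # Marker tied to data operations: the newest marker is row1's put at 8:00am.
    data_tied = apparent_lag_minutes(datetime(2013, 8, 19, 8, 0), check)

    # Marker written every second regardless of data: the newest one is from 8:59am.
    heartbeat = apparent_lag_minutes(datetime(2013, 8, 19, 8, 59), check)

    print(data_tied, heartbeat)
```

With a data-tied marker the monitor reports 59 minutes even though nothing is waiting to be replicated, which is why I assume the marker has to be written independently of the data.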

Again, I do agree that the timestamp marker is a good idea, but it needs
to be made easier for end users to apply.

Demai
On Mon, Aug 19, 2013 at 2:00 PM, Vladimir Rodionov
<[EMAIL PROTECTED]> wrote:

>
> Just a simple canary probe approach:
>
> Update cluster1:t1:canary_row with the current time every (say) 1 sec
> Read the time from cluster2:t1:canary_row every second
>
> Compute the difference
>
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [EMAIL PROTECTED]
>
> ________________________________________
> From: Demai Ni [[EMAIL PROTECTED]]
> Sent: Monday, August 19, 2013 1:28 PM
> To: [EMAIL PROTECTED]
> Subject: Re: I am curious how to find hbase replication lag
>
> Ted, thanks for connecting the two discussions. This is a topic that quite
> a few folks are looking for solutions to.
>
> Alex, as far as my study goes, there is no direct/easy way to get the lag
> info in a form that is easily consumable by regular users (i.e. not an
> HBase expert). From a user perspective, the lag can be measured in either
> time or quantity. Although the current HBase replication metrics contain a
> lot of good information, a third-party tool/monitor has to be applied (JMX,
> Ganglia, etc.). On 0.94, the metrics info is at the regionserver level; on
> 0.95 such info is also at the peer cluster level. I don't think there is a
> plan for table-level metrics yet.
>
> However, the use case is valid. For example: I am replicating table t1 from
> cluster M1 to cluster S1; how many seconds/minutes is the lag? Well, I
> haven't found a good solution yet. :-)
>
> Demai
>
>
> On Mon, Aug 19, 2013 at 1:04 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > This is related: http://search-hadoop.com/m/SrEIT1jtzPF
> >
> > Cheers
> >
> >
> > On Mon, Aug 19, 2013 at 12:50 PM, gordoslocos <[EMAIL PROTECTED]>
> > wrote:
> >
> > > I believe hbase keeps info in zk that gives you the count of pending
> > > operations to be replicated. Check into the rs zookeeper node in the
> > > hbase replication documentation.
> > >
> > > http://hbase.apache.org/replication.html
> > >
> > >
> > > On 19/08/2013, at 16:42, Alex Newman <[EMAIL PROTECTED]> wrote:
> > >
> > > > I have set up HBase replication. I want to know how out of date my
> > > > replica cluster is. How does one monitor that?
> > > >
> > > > -Alex Newman
> > >
> >
>