Zookeeper >> mail # user >> Data change notification is lost during failover


Re: Data change notification is lost during failover
On Tue, Jul 24, 2012 at 6:08 AM, Jack Luo <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I am using ZooKeeper 3.3.5 for a distributed project. During testing, we
> found a watch-related issue. Our monitor program places 100 watches on 100
> different paths (e.g. /goo1 ... /goo100) to monitor data changes, and
> another writer program updates one of the paths at a specified interval. We
> found that some data change notifications are sometimes lost when the
> monitor program switches to a new server because its current server has
> failed.
>
> I checked the “watch management” section in the current release notes
> http://zookeeper.apache.org/doc/trunk/releasenotes.html and found the statement
> “In this release the client library tracks watches that a client has
> registered and reregisters the watches when a connection is made to a new
> server.” Based on this, it looks like losing data change notifications
> during server failover, before the watches are successfully re-registered
> on the new server, is expected behavior.
>

See the programmer's guide here:
http://zookeeper.apache.org/doc/r3.3.5/zookeeperProgrammers.html#ch_zkWatches

"When a client reconnects, any previously registered watches will be
reregistered and triggered if needed. In general this all occurs
transparently. There is one case where a watch may be missed: a watch
for the existence of a znode not yet created will be missed if the
znode is created and deleted while disconnected."

So really, you should not lose any notifications in this case.
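
To make the one exception in that passage concrete, here is a minimal
sketch in Python. It is a toy in-memory model, not the ZooKeeper client
API: ToyTree and reregister_exists_watch are hypothetical names made up
for illustration. It assumes that a re-registered existence watch fires
only if the znode exists at the moment of re-registration.

```python
class ToyTree:
    """Hypothetical stand-in for the server's znode tree (not a real API)."""

    def __init__(self):
        self.nodes = set()

    def create(self, path):
        self.nodes.add(path)

    def delete(self, path):
        self.nodes.discard(path)

    def reregister_exists_watch(self, path):
        """Return the event delivered on re-registration, or None if the
        watch is simply re-armed because the znode still does not exist."""
        return "NodeCreated" if path in self.nodes else None


# Case 1: the znode is created while the client is disconnected.
tree = ToyTree()
tree.create("/goo1")
seen_after_create = tree.reregister_exists_watch("/goo1")

# Case 2: the znode is created AND deleted while the client is disconnected.
tree = ToyTree()
tree.create("/goo1")
tree.delete("/goo1")
seen_after_create_delete = tree.reregister_exists_watch("/goo1")

print(seen_after_create)         # NodeCreated
print(seen_after_create_delete)  # None
```

In the second case the server has nothing left to compare against, so
the create/delete pair goes unnoticed. Data watches on znodes that
already exist are not affected by this.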

> The workaround I came up with is to query all 100 paths and check for
> data changes after the monitor program connects to a new server.
>
> However, if we need to monitor 1000 or 10K paths, this approach may not
> scale. Can anyone suggest a better solution to this issue?
>
> Furthermore, can the ZK service be enhanced to replicate the watches on
> each ZK server to solve this issue for good?
>

The client maintains the zxid of the last change it saw from the
server. When it re-registers it will be notified of any changes since
that zxid. So really this is already supported. Sounds like a bug to
me, but I've not heard of any such issues from our users.
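
A rough way to picture the zxid comparison, as a sketch in Python. This
is a toy model under stated assumptions, not the real server code or
client API: ToyServer, set_data and reregister_data_watches are
hypothetical names. It assumes the server keeps, per znode, the zxid of
its last modification (mzxid) and compares it with the last zxid the
client reports having seen. Treating a missing znode as a NodeDeleted
event is also an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical in-memory stand-in for a server; not the ZooKeeper API."""
    zxid: int = 0                               # last transaction id applied
    mzxid: dict = field(default_factory=dict)   # path -> zxid of last modification

    def set_data(self, path):
        """Apply a write: bump the zxid and record it as the path's mzxid."""
        self.zxid += 1
        self.mzxid[path] = self.zxid

    def reregister_data_watches(self, last_seen_zxid, paths):
        """Return the events to fire immediately for re-registered data watches."""
        events = []
        for path in paths:
            if path not in self.mzxid:
                events.append((path, "NodeDeleted"))
            elif self.mzxid[path] > last_seen_zxid:
                events.append((path, "NodeDataChanged"))
            # otherwise nothing changed; the watch is simply re-armed
        return events


server = ToyServer()
paths = [f"/goo{i}" for i in range(1, 4)]
for p in paths:
    server.set_data(p)

last_seen = server.zxid      # the client saw everything up to this point
server.set_data("/goo2")     # written while the client is failing over

fired = server.reregister_data_watches(last_seen, paths)
print(fired)  # [('/goo2', 'NodeDataChanged')]
```

Under this model the client does not need to poll all of its paths after
a failover: only the paths modified after its last seen zxid produce an
event, and the rest are re-armed silently.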

Patrick