I am using ZooKeeper 3.3.5 for a distributed project. During testing, we
found a watch-related issue. Our monitor program places 100 watches on 100
different paths (e.g. /goo1 … /goo100) to monitor data changes, and
another writer program updates one of the paths at a specified interval. We
found that some data change notifications are sometimes lost when the
monitor program is switched to a new server due to the failure of the current
server.
I checked the “watch management” section in the current release notes
(http://zookeeper.apache.org/doc/trunk/releasenotes.html) and found the statement
“In this release the client library tracks watches that a client has
registered and reregisters the watches when a connection is made to a new
server.” Based on this, it looks like losing data change notifications
during server failover, before the watches are successfully re-registered
on the new server, is expected behavior.
The workaround I came up with is to query all 100 paths after the monitor
program connects to a new server and check whether any data has changed.
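
To make the workaround concrete, here is a minimal sketch of the re-check
step. It is not tied to any ZooKeeper client: find_changed_paths, last_seen
and fetch_version are hypothetical names I made up for illustration, and the
in-memory dict only stands in for the server. With a real client I assume
fetch_version would read the data version (or mzxid) from the Stat of each
path.

```python
from typing import Callable, Dict, List, Optional


def find_changed_paths(
    last_seen: Dict[str, Optional[int]],
    fetch_version: Callable[[str], Optional[int]],
) -> List[str]:
    """Return the paths whose version differs from the last one we saw.

    last_seen is updated in place so the next call starts from the
    refreshed state. fetch_version is a hypothetical callback; it should
    return None when the path no longer exists.
    """
    changed = []
    for path in sorted(last_seen):
        current = fetch_version(path)  # one read per monitored path
        if current != last_seen[path]:
            changed.append(path)
            last_seen[path] = current
    return changed


if __name__ == "__main__":
    # In-memory stand-in for the server: path -> data version.
    server = {f"/goo{i}": 0 for i in range(1, 101)}
    last_seen = dict(server)

    # Updates that happened while the monitor was between servers.
    server["/goo7"] += 1
    server["/goo42"] += 2

    # Called once after the session is connected to the new server.
    print(find_changed_paths(last_seen, server.get))
```

Comparing versions instead of the data itself keeps each check small, and a
path that was updated more than once during the failover is still reported
once.
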
However, if we need to monitor 1,000 or 10K paths, this solution may not
scale well. Can anyone suggest a better solution to this issue?
Furthermore, could the ZK service be enhanced to replicate the watches on
every ZK server, so that this issue is solved for good?
Thanks for your time and help!