

RE: lost ZK events across datacenters
Any state changes for the problem client between setting the watch and when you expected it to get called? Do you have logs for that client vs the others that show anything?

-----Original Message-----
From: Jun Rao [mailto:[EMAIL PROTECTED]]
Sent: Friday, June 03, 2011 4:40 AM
To: [EMAIL PROTECTED]
Subject: Re: lost ZK events across datacenters

Ben,

Some details below.

The call that sets the watcher simply calls getChildren with the watcher flag
set to true. The triggering change is that one of the child nodes (which is
ephemeral) is deleted because the creating client is gone.

Thanks,

Jun
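
To illustrate the one-shot watch behavior discussed in this thread, here is a
minimal sketch. It is not ZooKeeper or zkclient code: ToyChildStore and
ToyClient are hypothetical in-memory stand-ins, written only to show why the
read and the watch registration should happen in a single call, and why a
handler that re-reads the children on each event still converges on the
latest state even when several changes arrive before it runs.

```python
from collections import deque


class ToyChildStore:
    """Hypothetical in-memory stand-in for one parent node with one-shot child watches."""

    def __init__(self):
        self.children = set()
        self.watchers = []  # one-shot callbacks, cleared as soon as they fire

    def get_children(self, watcher=None):
        # The read and the watch registration happen in one step,
        # so no change can slip in between them.
        if watcher is not None:
            self.watchers.append(watcher)
        return sorted(self.children)

    def _fire(self):
        # One-shot semantics: every registered watcher fires once and is dropped.
        pending, self.watchers = self.watchers, []
        for callback in pending:
            callback()

    def create(self, name):
        self.children.add(name)
        self._fire()

    def delete(self, name):
        self.children.discard(name)
        self._fire()


class ToyClient:
    """Queues watch events, then handles each one by re-reading with a fresh watch."""

    def __init__(self, store):
        self.store = store
        self.events = deque()
        self.view = store.get_children(watcher=self._on_event)

    def _on_event(self):
        self.events.append("children-changed")

    def drain(self):
        # Stand-in for a separate event thread: dequeue, re-read, re-register.
        while self.events:
            self.events.popleft()
            self.view = self.store.get_children(watcher=self._on_event)


store = ToyChildStore()
store.create("a")
store.create("b")
client = ToyClient(store)

store.delete("a")  # fires the one-shot watch; one event is queued
store.create("c")  # no watch is registered at this point, so no second event
client.drain()
print(client.view)
```

In this toy model a single event covers both changes, because the re-read
returns the current children rather than a per-change delta. A listener that
expects one notification per change would see fewer events than changes.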

On Thu, Jun 2, 2011 at 10:49 AM, Benjamin Reed <[EMAIL PROTECTED]> wrote:

> can you tell us a bit more about the scenario? what was the call that
> set the watch event? and what were the changes that caused the event?
>
> thanx
> ben
>
> On Wed, Jun 1, 2011 at 3:14 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > All my clients were on different machines. 2 of them got the watcher fired
> > about the same time. The third one never got the watcher triggered.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Jun 1, 2011 at 2:18 PM, Fournier, Camille F. [Tech] <[EMAIL PROTECTED]> wrote:
> >
> >> All clients are in different processes?
> >> I've used zkclient and haven't seen any problems, but I haven't hammered it
> >> too hard yet. I took a long look at the code and didn't see any errors, but
> >> there could always be something very subtle.
> >>
> >> -----Original Message-----
> >> From: Jun Rao [mailto:[EMAIL PROTECTED]]
> >> Sent: Wednesday, June 01, 2011 4:09 PM
> >> To: [EMAIL PROTECTED]
> >> Subject: Re: lost ZK events across datacenters
> >>
> >> I am using the zkclient package
> >> (https://github.com/sgroschupf/zkclient.git).
> >> The watcher code seems reasonable. Basically, each watcher event is first
> >> added to a queue. A separate event thread dequeues each event, reads the
> >> children of a path (which re-registers the watcher), and invokes the
> >> registered listener.
> >>
> >> Does anybody know of any issues in zkclient?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Wed, Jun 1, 2011 at 12:04 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> >>
> >> > This is most commonly due, in my own history of programming errors, to
> >> > writing code that has a race window in it.  It is conceivable that cross
> >> > data-center operation would make such a race more of a problem.
> >> >
> >> > Can you say a bit about your code?  Did you make sure to use standard
> >> > idioms as opposed to setting the watch in a different call from reading
> >> > the data?
> >> >
> >> > On Wed, Jun 1, 2011 at 11:40 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I have a setup where multiple ZK clients are sitting in a different
> >> > > datacenter from the ZK server. All clients registered the same child
> >> > > watcher on a path. However, when the children of the path changed, the
> >> > > watcher on 1 of the clients didn't fire. This seems to have happened a
> >> > > couple of times to me. I am using ZK 3.3.3. Has anyone used ZK in a
> >> > > cross datacenter setup and seen problems like that before?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> >
> >>
> >
>