Zookeeper >> mail # user >> Getting confused with the "recipe for lock"


Re: Getting confused with the "recipe for lock"
Thanks!

I do agree with you that Client1 will eventually know that the lock is
invalid, by tracking disconnection and time.

But,

1. Time cannot be precisely synchronized between machines; it is likely
that client1 will detect the session timeout (via its own timer thread)
only after the server has already treated client1's session as timed out
and Client2 already thinks it holds the lock.

So, within a small time window, more than one client may believe it
holds the lock.

2. Thus, the lock protocol still cannot guarantee exclusiveness; is
it ... er... broken ?
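The disconnect-timer idea from the quoted reply below can be sketched in plain Python. This is illustrative only: `LockGuard` and its `disconnected`/`reconnected` hooks are made-up names that would have to be wired to a real ZooKeeper client's connection watcher, not an actual client API.

```python
import threading

class LockGuard:
    """Client-side lock-validity tracker (sketch, not a real ZK API).

    On disconnect, start a timer for the session timeout; if it fires
    before we reconnect, the server may already have expired our
    session, so the lock must be treated as lost."""

    def __init__(self, session_timeout, on_lost):
        self.session_timeout = session_timeout
        self.on_lost = on_lost     # e.g. interrupt the worker thread
        self.valid = True          # flag the long-running task polls
        self._timer = None

    def disconnected(self):        # wire to the client's connection watcher
        if self._timer is None:
            self._timer = threading.Timer(self.session_timeout, self._expire)
            self._timer.start()

    def reconnected(self):         # reconnected within the session timeout
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None

    def _expire(self):
        self.valid = False         # raise the flag the task checks
        self.on_lost()             # and actively notify it
```

Note that this only narrows the unsafe window to the clock skew between the client's timer and the server's session expiry; it does not close it, which is exactly the gap described above.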

On Fri, Jan 11, 2013 at 10:48 PM, Andrey Stepachev <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Yes, this scenario is quite possible.
> But it only matters for long-running tasks (longer than the session
> timeout); for short-lived tasks the lock will be released before the
> session times out, surely.
>
> In the case of long-lived locks, Client1 should track disconnection from
> the zk cluster and assume that the lock was abandoned (and somehow notify
> the lock owner about that). The client knows the session timeout value,
> so it can spawn a timer and act according to its program logic. For
> example, it can interrupt the thread that created the lock and raise a
> flag, so the long-running task can know the lock is no longer valid.
>
>
> On Fri, Jan 11, 2013 at 5:46 PM, Zhao Boran <[EMAIL PROTECTED]> wrote:
>
> > While reading the zookeeper's recipe for locks
> > (http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks),
> > I get confused:
> >
> > It seems that this recipe for a distributed lock cannot guarantee *"any
> > snapshot in time no two clients think they hold the same lock"*.
> >
> > But since zookeeper is so widely adopted, if there were such a mistake
> > in the reference doc, someone would have pointed it out long ago.
> >
> > So, what did I misunderstand? Please help me!
> >
> > Recipe for distributed lock (from
> > http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks)
> >
> > Locks
> >
> > Fully distributed locks that are globally synchronous, *meaning at any
> > snapshot in time no two clients think they hold the same lock*. These
> > can be implemented using ZooKeeper. As with priority queues, first
> > define a lock node.
> >
> >    1. Call create( ) with a pathname of "*locknode*/guid-lock-" and the
> >       sequence and ephemeral flags set.
> >    2. Call getChildren( ) on the lock node without setting the watch
> >       flag (this is important to avoid the herd effect).
> >    3. If the pathname created in step 1 has the lowest sequence number
> >       suffix, the client has the lock and the client exits the protocol.
> >    4. The client calls exists( ) with the watch flag set on the path in
> >       the lock directory with the next lowest sequence number.
> >    5. If exists( ) returns false, go to step 2. Otherwise, wait for a
> >       notification for the pathname from the previous step before going
> >       to step 2.
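The decision made in steps 2-4 of the recipe boils down to sorting the lock node's children by their sequence suffix. A minimal sketch of just that decision logic (the function name is illustrative; it assumes ZooKeeper's usual 10-digit sequence suffix on sequential znodes):

```python
def lock_decision(my_node, children):
    """Steps 2-4 of the recipe: given the children of the lock node and
    the znode name we created in step 1 (assumed to be among the
    children), decide whether we hold the lock and, if not, which
    single znode to watch."""
    # ZooKeeper appends a 10-digit sequence number to sequential znodes,
    # e.g. "guid-lock-0000000003"; sort the children by that suffix.
    ordered = sorted(children, key=lambda n: int(n[-10:]))
    if ordered[0] == my_node:
        return ("have_lock", None)   # step 3: lowest suffix wins
    # step 4: watch only the next-lowest node, avoiding the herd effect
    predecessor = ordered[ordered.index(my_node) - 1]
    return ("wait", predecessor)
```

Because each waiter watches only its immediate predecessor, a lock release wakes exactly one client instead of the whole herd.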
> >
> > Considering the following case:
> >
> >    - Client1 successfully acquires the lock (in step 3), with zk node
> >      "locknode/guid-lock-0";
> >
> >    - Client2 creates node "locknode/guid-lock-1", fails to acquire the
> >      lock, and watches "locknode/guid-lock-0";
> >
> >    - Later, for some reason (network congestion?), client1 fails to
> >      send a heartbeat message to the zk cluster on time, but client1 is
> >      still perfectly working, and assumes it still holds the lock.
> >
> >    - But ZooKeeper may decide client1's session has timed out, and then:
> >      1. deletes "locknode/guid-lock-0"
> >      2. sends a notification to Client2 (or sends the notification
> >         first?)
> >      3. but cannot send a "session timeout" notification to client1 in
> >         time (due to network congestion?)
> >
> >    - Client2 gets the notification, goes to step 2, gets the only node
> >      "locknode/guid-lock-1", which was created by itself; thus, client2
> >      assumes it holds the lock.
> >
> >    - But at the same time, client1 assumes it holds the lock.
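The overlap the scenario describes can be made concrete with a toy timeline. The timestamps below are invented purely for illustration; the point is only that there is a window between Client2 being notified and Client1's local timer firing during which both clients believe they hold the lock.

```python
# Hypothetical timeline of the scenario above (times in seconds).
events = [
    (0.0,  "client1", "acquires lock (guid-lock-0)"),
    (10.0, "client1", "last heartbeat reaches the server"),
    (15.0, "server",  "expires client1's session, deletes guid-lock-0"),
    (15.1, "client2", "notified; now believes it holds the lock"),
    (16.0, "client1", "local timer fires; finally learns lock is lost"),
]

def overlap(events):
    """Length of the window in which both clients think they hold the lock."""
    c2_acquire = next(t for t, who, what in events
                      if who == "client2" and "holds the lock" in what)
    c1_release = next(t for t, who, what in events
                      if who == "client1" and "lost" in what)
    return max(0.0, c1_release - c2_acquire)
```

With these made-up numbers the unsafe window is 0.9 seconds; its exact size depends on clock skew and notification delays, but the recipe alone cannot make it zero.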