Zookeeper >> mail # user >> curator leader reconnect


Re: curator leader reconnect
Jordan, thanks for looking into this.

I cloned the code and had a look. As far as I can tell, your test case
verifies that you get the leadership again after the RECONNECT happens.
This is also the case in my code.
But how does it check that there is a corresponding lock/ephemeral node in
the ZK cluster? In my case there is none.

I did some debugging:
If the connection is lost during InterProcessMutex.release(), the
releaseLocks call will throw an exception, right?
So lockData is not(!) set to null (line #130).
When InterProcessMutex.acquire() is then called after the RECONNECT, it
is treated as "re-entering".
So the lock is simply granted, without re-creating the lock in the ZK cluster.
This does not seem right to me.
But I'm the newbie here.
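
The suspected sequence can be sketched in plain Java. This is only a simplified model under stated assumptions, not Curator's code: FakeCluster, SketchMutex and the clearStateOnFailure switch are hypothetical names invented for illustration, and only the lockData field name is taken from the description above. The second variant (clearing the state in a finally block) is one possible remedy, not necessarily what the library does or should do.

```java
import java.util.HashSet;
import java.util.Set;

public class ReentrantLockSketch {

    /** Hypothetical stand-in for the ZK cluster: tracks which lock nodes exist. */
    static class FakeCluster {
        final Set<String> nodes = new HashSet<>();
        boolean connected = true;

        void create(String path) {
            requireConnection();
            nodes.add(path);
        }

        void delete(String path) {
            requireConnection();
            nodes.remove(path);
        }

        /** Simulates a lost session: the ephemeral nodes disappear on the server. */
        void loseSession() {
            connected = false;
            nodes.clear();
        }

        void reconnect() {
            connected = true;
        }

        private void requireConnection() {
            if (!connected) {
                throw new IllegalStateException("connection lost");
            }
        }
    }

    /** Hypothetical simplified mutex; NOT Curator's InterProcessMutex. */
    static class SketchMutex {
        private final FakeCluster cluster;
        private final String path;
        private final boolean clearStateOnFailure;
        // Non-null means "this client believes it already holds the lock".
        private String lockData;

        SketchMutex(FakeCluster cluster, String path, boolean clearStateOnFailure) {
            this.cluster = cluster;
            this.path = path;
            this.clearStateOnFailure = clearStateOnFailure;
        }

        void acquire() {
            if (lockData != null) {
                return; // treated as re-entering: no node is created
            }
            cluster.create(path);
            lockData = path;
        }

        void release() {
            if (clearStateOnFailure) {
                try {
                    cluster.delete(path);
                } finally {
                    lockData = null; // local state is reset even if delete fails
                }
            } else {
                cluster.delete(path); // throws while disconnected ...
                lockData = null;      // ... so this reset is skipped
            }
        }
    }

    /** Runs acquire, session loss, failed release, reconnect, acquire. */
    static boolean nodeExistsAfterReconnect(boolean clearStateOnFailure) {
        FakeCluster cluster = new FakeCluster();
        SketchMutex mutex = new SketchMutex(cluster, "/leader/lock", clearStateOnFailure);
        mutex.acquire();
        cluster.loseSession();
        try {
            mutex.release();
        } catch (IllegalStateException expected) {
            // release failed because the connection was lost
        }
        cluster.reconnect();
        mutex.acquire();
        return cluster.nodes.contains("/leader/lock");
    }

    public static void main(String[] args) {
        System.out.println("stale state kept: node exists = " + nodeExistsAfterReconnect(false));
        System.out.println("state cleared:    node exists = " + nodeExistsAfterReconnect(true));
    }
}
```

In this model the first call reports that no node exists after the second acquire(), which matches the symptom described above; the second call reports that the node is there again.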

Would be great if you can have a look.

/Hartmut

On 7 February 2012 at 09:05, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:

> I just pushed a test that simulates the situation you describe and it
> works correctly. Can you please have a look at it and see what's different
> about your case?
>
> TestLeaderSelectorCluster.java
>    testLostRestart()
> ________________________________________
> From: Hartmut Lang [[EMAIL PROTECTED]]
> Sent: Monday, February 06, 2012 9:55 PM
> To: [EMAIL PROTECTED]
> Subject: Re: curator leader reconnect
>
> Well, I use the CLI client to connect to the ZK cluster, and I see no
> entry.
>
> My setup:
> I have a cluster of three ZK-nodes.
> I have a client starting LeaderSelector, which is connected to one
> cluster-node.
> I see the ephemeral node.
>
> I stop the cluster node the client is connected to. The client finally
> sees a LOST event. The ephemeral node is gone (checked using the CLI).
> I start the cluster node again. The client sees the RECONNECT and calls
> start(). Then takeLeadership() is called.
> But there is no ephemeral node in the cluster.
>
> /Hartmut
>
>
> On 6 February 2012 at 18:46, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>
> > How are you verifying that there is no ephemeral node?
> >
> > -Jordan
> >
> > On 2/6/12 9:28 AM, "Hartmut Lang" <[EMAIL PROTECTED]> wrote:
> >
> > >Hi Jordan,
> > >
> > >thanks for the info.
> > >What I see in my LeaderSelector example is this:
> > >when I just call the start() method after RECONNECT, the takeLeadership()
> > >method is called again.
> > >But no ephemeral node exists in the ZK cluster for my client, so this
> > >does not seem right.
> > >What could I be doing wrong?
> > >
> > >/Hartmut
> > >On 6 February 2012 at 07:55, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
> > >
> > >> No - don't call close. I'm afraid that it's a bit confusing. It was an
> > >> afterthought. Maybe I should add a restart() method or something.
> > >>
> > >> -JZ
> > >>
> > >> On 2/5/12 10:48 PM, "Hartmut Lang" <[EMAIL PROTECTED]> wrote:
> > >>
> > >> >Thanks for your answer.
> > >> >If I call start() again on the same instance, should I call close()
> > >> >first?
> > >> >
> > >> >My first attempt was to call close() on the LeaderSelector on a
> > >> >LOST event.
> > >> >But then, of course, I no longer get the RECONNECT event on the
> > >> >LeaderSelectorListener.
> > >> >
> > >> >/Hartmut
> > >> >
> > >> >On 5 February 2012 at 23:53, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
> > >> >
> > >> >> You can either create a new LeaderSelector or call start() again on
> > >> >> your existing leader instance. Whatever's easier for your use-case.
> > >> >>
> > >> >> -Jordan
> > >> >>
> > >> >> On 2/5/12 8:09 AM, "Hartmut Lang" <[EMAIL PROTECTED]> wrote:
> > >> >>
> > >> >> >Hi,
> > >> >> >
> > >> >> >I am working on a small demo application using the Curator
> > >> >> >Leader-Election.
> > >> >> >What I understand from the wiki is that on a connection LOST event,
> > >> >> >the leader should exit its takeLeadership method.
> > >> >> >
> > >> >> >But what should be done with the LeaderSelector instance?
> > >> >> >Should it also be closed on a LOST event? Or can it be re-used,