NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Zookeeper >> mail # user >> leader election, scheduled tasks, losing leadership


Re: leader election, scheduled tasks, losing leadership
This is why you need a ConnectionStateListener. You'll get a notice that the connection has been suspended, and at that point you should assume all locks/leaders are invalid.

-JZ
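
To make the advice above concrete, here is a minimal sketch of the pattern, written without any Curator dependency so it runs on its own. Everything in it is an assumption for illustration: `LeadershipGuard`, `ConnState`, and `runTask` are hypothetical names, and the enum only mirrors the names of Curator's `ConnectionState` values. In a real application the `onStateChanged` call would presumably be made from a `ConnectionStateListener` registered through `client.getConnectionStateListenable().addListener(...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeadershipGuard {
    /** Hypothetical stand-in for the client's connection states (names mirror Curator's). */
    public enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    // True only while we believe our leadership is still valid.
    private final AtomicBoolean leader = new AtomicBoolean(false);

    /** Call when the election recipe reports that this process became leader. */
    public void onLeadershipAcquired() {
        leader.set(true);
    }

    /** Call from the connection state listener on every state change. */
    public void onStateChanged(ConnState newState) {
        if (newState == ConnState.SUSPENDED || newState == ConnState.LOST) {
            // Connection trouble: assume all locks/leaders are invalid.
            leader.set(false);
        }
        // RECONNECTED deliberately does not restore leadership here;
        // it has to be re-acquired through the election recipe.
    }

    /** The "task in progress" check: is it still safe to keep working? */
    public boolean mayContinue() {
        return leader.get();
    }

    /** Runs steps in order, checking leadership before each; returns how many steps ran. */
    public int runTask(List<Runnable> steps) {
        int done = 0;
        for (Runnable step : steps) {
            if (!mayContinue()) {
                break; // leadership may have been revoked mid-task
            }
            step.run();
            done++;
        }
        return done;
    }

    public static void main(String[] args) {
        final LeadershipGuard guard = new LeadershipGuard();
        guard.onLeadershipAcquired();

        List<Runnable> steps = new ArrayList<>();
        steps.add(() -> System.out.println("step 1"));
        // Simulate the connection being suspended while the task is running.
        steps.add(() -> guard.onStateChanged(ConnState.SUSPENDED));
        steps.add(() -> System.out.println("step 3 (skipped after suspension)"));

        System.out.println("steps run: " + guard.runTask(steps));
    }
}
```

The design choice worth noting is that a reconnect does not flip the flag back: once the connection has been suspended, another member may already have been elected, so the task only resumes after leadership is explicitly re-acquired.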

On Dec 8, 2012, at 9:02 PM, Henry Robinson <[EMAIL PROTECTED]> wrote:

> What about a network disconnection? Presumably leadership is revoked when
> the leader appears to have failed, which can be for more reasons than a VM
> crash (VM running slow, network event, GC pause etc).
>
> Henry
>
> On 8 December 2012 21:00, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>
>> The leader latch lock is the equivalent of task in progress. I assume the
>> task is running in the same VM as the leader lock. The only reason the VM
>> would lose leadership is if it crashes in which case the process would die
>> anyway.
>>
>> -JZ
>>
>> On Dec 8, 2012, at 8:56 PM, Eric Pederson <[EMAIL PROTECTED]> wrote:
>>
>>> If I recall correctly it was Henry Robinson that gave me the advice to have
>>> a "task in progress" check.
>>>
>>>
>>> -- Eric
>>>
>>>
>>>
>>> On Sat, Dec 8, 2012 at 11:54 PM, Eric Pederson <[EMAIL PROTECTED]> wrote:
>>>
>>>> I am using Curator LeaderLatch :)
>>>>
>>>>
>>>> -- Eric
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Dec 8, 2012 at 11:52 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> You might check your leader implementation. Writing a correct leader
>>>>> recipe is actually quite challenging due to edge cases. Have a look at
>>>>> Curator (disclosure: I wrote it) for an example.
>>>>>
>>>>> -JZ
>>>>>
>>>>> On Dec 8, 2012, at 8:49 PM, Eric Pederson <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Actually I had the same thought and didn't consider having to do this until
>>>>>> I talked about my project at a Zookeeper User Group a month or so ago and I
>>>>>> was given this advice.
>>>>>>
>>>>>> I know that I do see leadership being lost/transferred when one of the ZK
>>>>>> servers is restarted (not the whole ensemble). And it seems like I've
>>>>>> seen it happen even when the ensemble stays totally stable (though I am not
>>>>>> 100% sure as it's been a while since I have worked on this particular
>>>>>> application).
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- Eric
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Dec 8, 2012 at 11:25 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Why would it lose leadership? The only reason I can think of is if the ZK
>>>>>>> cluster goes down. In normal use, the ZK cluster won't go down (I assume
>>>>>>> you're running 3 or 5 instances).
>>>>>>>
>>>>>>> -JZ
>>>>>>>
>>>>>>> On Dec 8, 2012, at 8:17 PM, Eric Pederson <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> During the time the task is running a cluster member could lose its
>>>>>>>> leadership.
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679