Re: leader election, scheduled tasks, losing leadership
> My point is that by the time that VM sees SUSPENDED/LOST, another VM may
> have been elected leader and have started running another process.
There's no way around this, right? ZK is not a transactional system so this edge-case is unsolvable.

> The way
> around the problem is to either ensure that no work is done by you once you
> are no longer the leader

You only release leadership when your work is done. If the cluster becomes unstable then you cancel your work. Leadership is denoted by a ZNode. Curator has a top-level watcher that notifies on cluster instability. How does the fence make this better?
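For illustration, here is a minimal sketch of what a fence could look like. It is not Curator or ZooKeeper API: FencedResource, Leader, and the epoch value are hypothetical names. The sketch assumes each leader carries a number that grows with every election (for example one derived from the sequential ZNode that denotes leadership; the thread does not say how to obtain it) and that the resource being modified can check that number on every write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared resource (a database, a file store, ...) that remembers
// the highest leadership epoch it has seen and refuses anything older.
final class FencedResource {
    private long highestEpochSeen = -1;
    private final List<String> appliedWrites = new ArrayList<>();

    // Returns true if the write was applied, false if it was fenced off.
    synchronized boolean write(long epoch, String payload) {
        if (epoch < highestEpochSeen) {
            return false; // the caller is a stale leader: reject the write
        }
        highestEpochSeen = epoch;
        appliedWrites.add(payload);
        return true;
    }

    // Returns a copy of the writes that were accepted, in order.
    synchronized List<String> appliedWrites() {
        return new ArrayList<>(appliedWrites);
    }
}

// Hypothetical leader. The epoch is assumed to increase with every election.
final class Leader {
    private final long epoch;
    private final FencedResource resource;

    Leader(long epoch, FencedResource resource) {
        this.epoch = epoch;
        this.resource = resource;
    }

    // Every unit of work carries the epoch, so the resource decides whether
    // this leader is still current; the leader's own view is not trusted.
    boolean doWork(String payload) {
        return resource.write(epoch, payload);
    }
}
```

With this in place, a leader with epoch 1 that has not yet seen SUSPENDED/LOST can keep calling doWork, but once a leader with epoch 2 has written to the resource, the old leader's writes are rejected. The check happens at the resource, so it does not depend on when the old leader receives the notification.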

-JZ

On Dec 8, 2012, at 9:30 PM, Henry Robinson <[EMAIL PROTECTED]> wrote:

> On 8 December 2012 21:18, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>
>> If your ConnectionStateListener gets SUSPENDED or LOST you've lost
>> connection to ZooKeeper. Therefore you cannot use that same ZooKeeper
>> connection to manage a node that denotes the process is running or not.
>> Only 1 VM at a time will be running the process. That process can watch for
>> SUSPENDED/LOST and wind down the task.
>>
>>
> My point is that by the time that VM sees SUSPENDED/LOST, another VM may
> have been elected leader and have started running another process.
>
> It's a classic problem - you need some mechanism to fence a node that
> thinks it's the leader, but isn't and hasn't got the memo yet. The way
> around the problem is to either ensure that no work is done by you once you
> are no longer the leader (perhaps by checking every time you want to do
> work), or that the work you do does not affect the system (e.g. by
> idempotent work units).
>
> ZK itself solves this internally by checking that it has a quorum for
> every operation, which forces an ordering between the disconnection event
> and trying to do something that relies upon being the leader. Other systems
> forcibly terminate old leaders before allowing a new leader to take the
> throne.
>
> Henry
>
>
>>> You can't assume that the notification is received locally before another
>>> leader election finishes elsewhere
>> Which notification? The ConnectionStateListener is an abstraction on
>> ZooKeeper's watcher mechanism. It's only significant for the VM that is the
>> leader. Non-leaders don't need to be concerned.
>
>
>> -JZ
>>
>> On Dec 8, 2012, at 9:12 PM, Henry Robinson <[EMAIL PROTECTED]> wrote:
>>
>>> You can't assume that the notification is received locally before another
>>> leader election finishes elsewhere (particularly if you are running slowly
>>> for some reason!), so it's not sufficient to guarantee that the process
>>> that is running locally has finished before someone else starts another.
>>>
>>> It's usually best - if possible - to restructure the system so that
>>> processes are idempotent to work around these kinds of problem, in
>>> conjunction with using the kind of primitives that Curator provides.
>>>
>>> Henry
>>>
>>> On 8 December 2012 21:04, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>>>
>>>> This is why you need a ConnectionStateListener. You'll get a notice that
>>>> the connection has been suspended and you should assume all locks/leaders
>>>> are invalid.
>>>>
>>>> -JZ
>>>>
>>>> On Dec 8, 2012, at 9:02 PM, Henry Robinson <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> What about a network disconnection? Presumably leadership is revoked when
>>>>> the leader appears to have failed, which can be for more reasons than a VM
>>>>> crash (VM running slow, network event, GC pause etc).
>>>>>
>>>>> Henry
>>>>>
>>>>> On 8 December 2012 21:00, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> The leader latch lock is the equivalent of task in progress. I assume the
>>>>>> task is running in the same VM as the leader lock. The only reason the VM
>>>>>> would lose leadership is if it crashes in which case the process would die
>>>>>> anyway.
>>>>>>
>>>>>> -JZ
>>>>>>
>>>>>> On Dec 8, 2012, at 8:56 PM, Eric Pederson <[EMAIL PROTECTED]> wrote: