Re: Zookeeper on short lived VMs and ZOOKEEPER-107
Actually it's still not clear to me how you would enforce the 2x+1. In ZooKeeper we can guarantee liveness (progress) only when x+1 servers are connected and up; safety (correctness), however, is always guaranteed, even if 2 out of 3 servers are temporarily down. Your design needs the 2x+1 for safety, which I think is problematic unless you can detect failures accurately (synchrony) and failures are permanent.
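
To make the arithmetic concrete, here is a minimal Java sketch (illustrative only, not ZooKeeper code) of the majority-quorum math: with n = 2x+1 servers, progress needs x+1 of them up, while safety rests on the fact that any two majorities of the same ensemble overlap in at least one server.

    // Illustrative quorum arithmetic for an ensemble of n = 2x + 1 servers.
    public class QuorumMath {
        public static void main(String[] args) {
            int x = 1;              // failures to tolerate
            int n = 2 * x + 1;      // ensemble size, e.g. 3
            int quorum = n / 2 + 1; // majority, e.g. 2

            // Liveness: progress needs a majority (x + 1) connected and up.
            System.out.println("need " + quorum + " of " + n + " for progress");

            // Safety: any two majorities of the SAME ensemble overlap, so two
            // leaders can never both commit conflicting updates.
            int minOverlap = 2 * quorum - n; // = 1 for any odd n
            System.out.println("any two quorums share >= " + minOverlap + " server(s)");
        }
    }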

Alex
On Mar 15, 2012, at 3:54 PM, Alexander Shraer <[EMAIL PROTECTED]> wrote:

> I think the concern is that the old VM can recover and try to
> reconnect. Theoretically you could even flip back and forth between the
> new and old VM. For example, suppose that you have servers
> A, B and C in the cluster, and A is the leader. C is slow and "replaced"
> with C', then update U is acked by A and C', and then A fails. At this
> point the ensemble cannot tolerate any additional failure. But with the
> automatic replacement scheme it can (theoretically) happen that C'
> becomes a little slow, C reconnects to B and is chosen as leader, and
> the committed update U is lost forever. This is perhaps unlikely but
> possible...
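>
> To see why the two quorums in this trace need not overlap, here is a small
> illustrative Java sketch (names as in the example above, with C1 standing
> in for C'; this is not ZooKeeper code):
>
>     import java.util.Set;
>
>     // The quorum that acked U after C was auto-replaced by C', versus the
>     // quorum that later elects C leader once C reconnects to B.
>     public class LostUpdateTrace {
>         public static void main(String[] args) {
>             Set<String> ackedU = Set.of("A", "C1"); // leader A plus C'
>             Set<String> voters = Set.of("B", "C");  // old C returns, pairs with B
>
>             boolean overlap = ackedU.stream().anyMatch(voters::contains);
>             // No overlap: nothing forces the new leader to have seen U.
>             System.out.println("quorums overlap: " + overlap); // prints false
>         }
>     }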
>
> Alex
>
> On Thu, Mar 15, 2012 at 1:35 PM,  <[EMAIL PROTECTED]> wrote:
>> I agree with your points that any kind of VM has hard-to-predict runtime behaviour and that participants of the ZooKeeper ensemble running on a VM could fail to send keep-alives for an uncertain amount of time. But I don't yet understand how that would break the approach I mentioned: re-resolving the InetAddress after an IO exception should in that case still lead to the same original IP address (and eventually to that node rejoining the ensemble).
>> Only if the host name (which the old node was using) were re-assigned to another instance would this re-resolution step point to a new IP (and hence cause the old server to be replaced).
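>>
>> For what it's worth, here is a rough sketch of the re-resolution idea (a
>> hypothetical helper, not the actual QuorumPeer code): on an IOException,
>> look the host name up again instead of reusing the cached address, so a
>> rejoining old VM resolves to the same IP while a repurposed host name
>> resolves to a new one.
>>
>>     import java.io.IOException;
>>     import java.net.InetAddress;
>>     import java.net.InetSocketAddress;
>>     import java.net.Socket;
>>
>>     // Hypothetical helper: retry a connection after re-resolving the name.
>>     class ReResolvingConnector {
>>         Socket connect(String hostName, int port, InetAddress cached)
>>                 throws IOException {
>>             try {
>>                 return open(cached, port);
>>             } catch (IOException e) {
>>                 // Re-resolve: same IP if the old node is just rejoining,
>>                 // a new IP only if the name was re-assigned to a new VM.
>>                 InetAddress fresh = InetAddress.getByName(hostName);
>>                 return open(fresh, port);
>>             }
>>         }
>>
>>         private Socket open(InetAddress addr, int port) throws IOException {
>>             Socket s = new Socket();
>>             s.connect(new InetSocketAddress(addr, port), 5000);
>>             return s;
>>         }
>>     }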
>>
>> Did I understand your objection correctly?
>>
>> ________________________________________
>> From: ext Ted Dunning [[EMAIL PROTECTED]]
>> Sent: Thursday, 15 March 2012 19:50
>> To: [EMAIL PROTECTED]
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: Zookeeper on short lived VMs and ZOOKEEPER-107
>>
>> Alexander's comment still applies.
>>
>> VMs can function or go away completely, but they can also malfunction
>> in more subtle ways, such that they just run VEEEERRRRY slowly.  You
>> have to account for that failure mode.  These failures can even be
>> transient.
>>
>> This would probably break your approach.
>>
>> On 3/15/12, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>>> Oh sorry, there is a slight misunderstanding. By VM I did not mean the
>>> Java VM but the Linux VM that contains the ZooKeeper node. We get
>>> notified if that VM goes away and is repurposed.
>>>
>>> BR
>>>  Christian
>>>
>>> Sent from my Nokia Lumia 800
>>> ________________________________
>>> From: ext Alexander Shraer
>>> Sent: 15.03.2012 16:33
>>> To: [EMAIL PROTECTED]; Ziech Christian (Nokia-LC/Berlin)
>>> Subject: Re: Zookeeper on short lived VMs and ZOOKEEPER-107
>>>
>>> Yes, by replacing at most x servers at a time out of 2x+1 you preserve
>>> quorum intersection.
>>>
>>> I have one more question: ZooKeeper itself doesn't assume perfect
>>> failure detection, which your scheme requires. What if the VM didn't
>>> actually fail but was just slow and then tries to reconnect?
>>>
>>> On Thu, Mar 15, 2012 at 2:50 AM, Christian Ziech
>>> <[EMAIL PROTECTED]> wrote:
>>>> I don't think we would run into a split-brain problem in our use case.
>>>> Let me try to describe the scenario we are worried about (assuming an
>>>> ensemble of 5 nodes A, B, C, D, E):
>>>> - The ensemble is up and running and in sync
>>>> - Node A with the host name "zookeeperA.whatever-domain.priv" goes down
>>>> because the VM has gone away
>>>> - That removal of the VM is detected and a new VM is spawned with the same
>>>> host name "zookeeperA.whatever-domain.priv" - let's call that node A'
>>>> - The ZooKeeper server on node A' wants to join the cluster - right now this gets