Zookeeper, mail # user - Zookeeper on short lived VMs and ZOOKEEPER-107

Re: Zookeeper on short lived VMs and ZOOKEEPER-107
Alexander Shraer 2012-03-15, 15:33
Yes, if you replace at most x servers at a time out of 2x+1, you keep quorum intersection.

I have one more question: ZooKeeper itself doesn't assume perfect
failure detection, which your scheme requires. What if the VM didn't
actually fail but is just slow, and then tries to reconnect?
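
The "x at a time from 2x+1" arithmetic can be sketched as a small check. This is only an illustration of the counting argument, not ZooKeeper code: the class and method names are made up for this example, and it assumes that a replaced server really is gone (the perfect failure detection questioned above).

```java
public class QuorumCheck {
    /** Smallest number of servers that forms a majority of an ensemble of size n. */
    public static int majority(int n) {
        return n / 2 + 1;
    }

    /** True if the servers that were NOT replaced still form a majority by themselves. */
    public static boolean survivorsFormMajority(int ensembleSize, int replaced) {
        return ensembleSize - replaced >= majority(ensembleSize);
    }

    /** True if every possible majority must contain at least one non-replaced server. */
    public static boolean everyMajorityHasSurvivor(int ensembleSize, int replaced) {
        return majority(ensembleSize) > replaced;
    }

    public static void main(String[] args) {
        // Hypothetical ensemble sizes 2x+1 for x = 1..3, replacing x or x+1 servers.
        for (int x = 1; x <= 3; x++) {
            int n = 2 * x + 1;
            System.out.println("n=" + n + ", replace " + x + ": survivors form majority = "
                    + survivorsFormMajority(n, x));
            System.out.println("n=" + n + ", replace " + (x + 1) + ": survivors form majority = "
                    + survivorsFormMajority(n, x + 1));
        }
    }
}
```

With 5 servers, replacing 2 leaves 3 untouched servers, which is still a majority; replacing 3 leaves only 2, which is not.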

On Thu, Mar 15, 2012 at 2:50 AM, Christian Ziech
> I don't think that we could be running into a split brain problem in our use
> case.
> Let me try to describe the scenario we are worried about (assuming an
> ensemble of 5 nodes A,B,C,D,E):
> - The ensemble is up and running and in sync
> - Node A with the host name "zookeeperA.whatever-domain.priv" goes down
> because the VM has gone away
> - That removal of the VM is detected and a new VM is spawned with the same
> host name "zookeeperA.whatever-domain.priv" - let's call that node A'
> - Node A' zookeeper wants to join the cluster - right now this gets rejected
> by the others since A' has a different IP address than A (and the old one is
> "cached" in the InetSocketAddress of the QuorumPeer instance)
> We could ensure that at any given time there is at most one node with
> host name "zookeeperA.whatever-domain.priv" known by the ensemble and that
> once a node is replaced, it would not come back. Also we could make sure
> that our ensemble is big enough to tolerate the replacement of up to
> x nodes at a time (by setting its size to x*2 + 1 nodes).
> So unless I have misjudged our problem, it should be (due to these
> restrictions) simpler than the problem to be solved by zookeeper-107. My
> intention is basically to solve this smaller, discrete problem so that we
> do not need to wait until zookeeper-107 makes it into a release (the
> assumption being that a smaller fix might even have a chance of making it
> into the 3.4.x branch).
> On 15.03.2012 07:46, ext Alexander Shraer wrote:
>> Hi Christian,
>> ZK-107 would indeed allow you to add/remove servers and change their
>> addresses.
>> > We could ensure that we always have a more or less fixed quorum of
>> > zookeeper servers with a fixed set of host names.
>> You should probably also ensure that a majority of the old ensemble
>> intersects with a majority of the new one.
>> Otherwise you have to run a reconfiguration protocol similar to ZK-107.
>> For example, if you have 3 servers A, B and C, and now you're adding D and E
>> to replace B and C, how would this work? It is probable that D and E
>> don't have the latest state (as you mention) and A is down or doesn't have
>> the latest state either (a minority might not have the latest state). Also, how
>> do you prevent split brain in this case, meaning B and C thinking that they
>> are still operational? Perhaps I'm missing something, but I suspect that the
>> change you propose won't be enough...
>> Best Regards,
>> Alex
>> On Wed, Mar 14, 2012 at 10:01 AM, Christian Ziech
>> <[EMAIL PROTECTED]> wrote:
>>    Just a small addition: In my opinion the patch could really boil
>>    down to adding a
>>      quorumServer.electionAddr = new InetSocketAddress(
>>          electionAddr.getHostName(), electionAddr.getPort());
>>    in the catch(IOException e) clause of the connectOne() method of
>>    the QuorumCnxManager. In addition, one should perhaps make the
>>    electionAddr field in the QuorumPeer.QuorumServer class volatile
>>    to prevent races.
>>    I haven't yet fully checked this change for implications, but
>>    a quick test on some machines at least showed it would solve our
>>    use case. What do the more expert users / maintainers think - is
>>    it even worthwhile to go that route?
>>    On 14.03.2012 17:04, ext Christian Ziech wrote:
>>        Let me describe our upcoming use case in a few words: We are
>>        planning to use zookeeper in a cloud where typically nodes come
>>        and go unpredictably. We could ensure that we always have a
>>        more or less fixed quorum of zookeeper servers with a fixed