ms209495 2013-11-26, 08:54
Camille Fournier 2013-11-26, 20:10
ms209495 2013-11-26, 22:28
Cameron McKenzie 2013-11-26, 22:34
Alexander Shraer 2013-11-26, 22:58
Ted Dunning 2013-11-27, 00:53
Cameron McKenzie 2013-11-27, 00:57
Alexander Shraer 2013-11-27, 01:20
Ted Dunning 2013-11-28, 05:14
Maciej 2013-11-26, 23:17
Cameron McKenzie 2013-11-26, 23:41
Bryan Thompson 2013-11-26, 23:27
Since there is always a period between checking you are master and
performing an action as master, so can't guarantee that another node
hasn't taken mastership before you perform the action. However, if you
are using some shared storage for the state of the system, you can
block other masters from writing to it while you are master,
preventing split brain from occurring.
The simplest way to do this is to store all your state in
zookeeper. With this, if B is partitioned away, it will not be able to
update the state. This won't scale too far though, as zk holds
everything in memory. It may work if the system isn't too big though.
Another solution is to use
Bookkeeper(http://zookeeper.apache.org/bookkeeper), or another shared
storage with fencing, to sequence the updates to your
state. Bookkeeper is a distributed write ahead log, which writes
entries to a quorum before responding to client. It has a fencing
mechanism which sends a 'fence' message to at least one node in each
quorum, blocking all further writes to that log. In your case, if a
node in B is master and is putting all state updates into a bookkeeper
log before applying them, and the there is a partition and a node in A
becomes master, A will fence B's log before applying any state updates
of it's own.
Yet another solution, though I don't know how well it would work, if
to use locks in NFS. If B is logging to a file on a SAN, it get
an exclusive lock on the file handle. This will block anyone else from
logging to it until B's NFS session goes away. I'm not sure how long
it takes for sessions to timeout though, or how widely implemented or
reliable this part of the NFS spec is though.
Hope this helps,
On Tue, Nov 26, 2013 at 12:54:44AM -0800, ms209495 wrote:
> ZooKeeper is an excellent system but the problem with ensuring only one
> master among clients bothers me.
> Lets have a look at the situation when network partition happen: there is
> part A (majority), and part B (minority).
> Lets assume that before network partition happened the master was connected
> to part B.
> After the network partition, part A will elect new ZooKeeper leader, and
> there will be new master elected among clients connected to part A.
> At this time there are two masters - old in part B, and new in part A.
> The only solution I can think about to this problem, is to ensure that the
> new master is inactive for some time - to ensure that the old master in this
> time will detect that it is not connected to ZooKeeper quorum, and will
> deactivate itself as a master.
> This solution assumes that timers on these machines work correctly.
> Is it possible to ensure only one master using ZooKeeper without timing
> assumptions ?
> View this message in context: http://zookeeper-user.578899.n2.nabble.com/Ensure-there-is-one-master-tp7579367.html
> Sent from the zookeeper-user mailing list archive at Nabble.com.