D or E can only become leader if they can form a quorum with B and/or C
(assuming you've taken A down). If A, D, and E are the only three servers
able to talk to each other, killing A will make you lose quorum, and if
you bring A back up it might be elected again, so you could be stuck in
this cycle until the network partition heals or B and/or C come back up.
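
A rough sketch of the quorum arithmetic in that scenario, assuming a plain
majority quorum over the 5-server ensemble. The helper name has_quorum is
made up for illustration and is not a ZooKeeper API.

```python
def has_quorum(reachable, ensemble_size):
    """Return True if the reachable servers are a strict majority."""
    return len(reachable) > ensemble_size // 2


ensemble = {"A", "B", "C", "D", "E"}
partition = {"A", "D", "E"}  # the only servers that can talk to each other

# With A up, three of five servers can still form a quorum.
with_a = has_quorum(partition, len(ensemble))

# With A taken down, D and E alone are only two of five.
without_a = has_quorum(partition - {"A"}, len(ensemble))

print(with_a, without_a)
```

So in this partition D and E cannot elect anyone on their own; they need A,
or B and/or C, to get back to a majority.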
I suppose we could have an option that tells servers to ignore a given
server when electing a leader, but it is not entirely trivial because A in
the example Ben gave might be the only available server that has the latest
committed txn. We would need a mechanism to transfer state from a follower
to a prospective leader.
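
To illustrate why A is hard to skip, here is a minimal sketch of how votes
could be ordered, assuming they are compared by epoch first, then by last
zxid, then by server id (which, as far as I understand it, is the ordering
the default election algorithm uses). The Vote type, pick_leader, and all
the numbers are hypothetical.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    # Field order matters: tuples compare epoch, then zxid, then server_id.
    epoch: int
    zxid: int
    server_id: int


def pick_leader(votes):
    """Return the name of the server whose vote sorts highest."""
    return max(votes, key=lambda name: votes[name])


# A is the only survivor with the latest txn (zxid 101), and it has the
# lowest server id of the three.
before_sync = {
    "A": Vote(epoch=1, zxid=101, server_id=1),
    "D": Vote(epoch=1, zxid=100, server_id=4),
    "E": Vote(epoch=1, zxid=100, server_id=5),
}

# Once D and E have synced with A, everyone is equally up to date and the
# server id breaks the tie.
after_sync = {
    "D": Vote(epoch=2, zxid=101, server_id=4),
    "E": Vote(epoch=2, zxid=101, server_id=5),
}

print(pick_leader(before_sync), pick_leader(after_sync))
```

Under these assumptions A wins while it is the only server with the latest
txn, regardless of its id, and the id only matters once the others have
caught up.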
> -----Original Message-----
> From: Benjamin Reed [mailto:[EMAIL PROTECTED]]
> Sent: 25 November 2013 01:21
> To: [EMAIL PROTECTED]
> Subject: Re: Disqualify a node from leader election
> camille really has the right solution. you have to let it become the
> leader, then kill it. here is why:
> let's say you have servers A, B, C, D, and E, and A is the node that you
> don't want to be the leader. let's also say that C is a leader and commits
> x on A, B, and C, but before D and E get x, B and C fail. now A is the only
> surviving node with x, so unless A can become the leader, at least
> temporarily, D and E will never get x. if you follow Camille's suggestion,
> let A become the leader (since it has the most recent
> transaction) and D and E will sync with A and get x. now if you restart A,
> leader election will run again and D or E can be elected leader. this does
> show that you want the less desirable nodes to have lower server ids so
> there will be less chance of them becoming leader if there is a "tie"
> (other nodes are just as up to date as them).
> On Sun, Nov 24, 2013 at 5:09 PM, Owen Kim <[EMAIL PROTECTED]> wrote:
> > Observers can't vote in leader election, though, right? I'm not sure
> > the loss of fault tolerance would be worth it.
> > The scenario is that I have a 5-node cluster but one node is in a
> > network partition that gets DoSed around it. When this happens and
> > it's leader, I see huge performance degradation. The real solution is
> > obviously to move the node off that network but I wondered if there
> > was an easy way to keep it from being leader in its configuration.