0.8 best practices for migrating / electing leaders in failure situations?
What would the recommended practice be for the following scenarios?
Running on EC2, with ephemeral disks only for Kafka.
There are 3 Kafka servers. The broker ids are always increasing. If a
broker dies, it's never coming back.
All topics have a replication factor of 3.
* Scenario 1: BrokerID 1,2,3 Broker 2 dies.
Boot another: BrokerID 4
?? run bin/kafka-reassign-partitions.sh for any topic+partition and
replace brokerid 2 with brokerid 4
?? anything else to do to cause messages to be replicated to 4??
NOTE: This appears to work, but I'm not positive that broker 4 got messages replicated to it.
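
For scenario 1, the reassignment file could be generated rather than edited by hand. This is a minimal sketch, assuming the tool accepts a JSON document with a top-level "partitions" list of topic/partition/replicas entries (later releases also expect a "version": 1 field, so check the tool's --help output for your release). The topic name "events", the sample assignment, and the helper replace_broker are hypothetical; the real assignment would have to be read from ZooKeeper or from the topic listing tool.

```python
import json


def replace_broker(assignment, dead_id, new_id):
    """Build a reassignment plan that swaps dead_id for new_id.

    assignment maps (topic, partition) -> list of replica broker ids.
    Only partitions that actually had a replica on dead_id are included.
    """
    partitions = []
    for (topic, partition), replicas in sorted(assignment.items()):
        if dead_id in replicas:
            # Keep replica order so the preferred (first) replica slot is preserved.
            new_replicas = [new_id if b == dead_id else b for b in replicas]
            partitions.append(
                {"topic": topic, "partition": partition, "replicas": new_replicas}
            )
    return {"partitions": partitions}


# Hypothetical current assignment for a topic with replication factor 3.
current = {
    ("events", 0): [1, 2, 3],
    ("events", 1): [2, 3, 1],
}

plan = replace_broker(current, dead_id=2, new_id=4)
print(json.dumps(plan, indent=2))
```

The printed JSON would be saved to a file and handed to bin/kafka-reassign-partitions.sh. Whether broker 4 has caught up can then be checked by looking at the in-sync replica list for each partition.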
* Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK is still up.
Messages obviously lost.
Recover to a functional state by:
Boot 3 more: 4,5,6
?? run bin/kafka-reassign-partitions.sh for all topics/partitions, swap
1,2,3 for 4,5,6?
?? run bin/kafka-preferred-replica-election.sh for all topics/partitions
?? anything else to do to allow producers to start sending successfully??
NOTE: I had some trouble with scenario 2. I will try to reproduce it and open a
ticket if my procedures for scenario 2 are in fact correct and I still
can't get to a good state.
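
For scenario 2, both input files could be produced from one mapping of old to new broker ids. This is only a sketch of the procedure as I understand it, not a confirmed recovery method. It assumes the same "partitions" JSON layout for the reassignment tool, and a list of topic/partition entries for the preferred replica election tool. The topic name "events" and the helpers remap_brokers and election_list are hypothetical.

```python
import json


def remap_brokers(assignment, mapping):
    """Build a reassignment plan with every broker id translated via mapping.

    Broker ids that are not in mapping are left unchanged.
    """
    return {
        "partitions": [
            {
                "topic": topic,
                "partition": partition,
                "replicas": [mapping.get(b, b) for b in replicas],
            }
            for (topic, partition), replicas in sorted(assignment.items())
        ]
    }


def election_list(assignment):
    """List every topic/partition for a preferred replica election."""
    return {
        "partitions": [
            {"topic": topic, "partition": partition}
            for (topic, partition) in sorted(assignment)
        ]
    }


# Hypothetical assignment as recorded before brokers 1, 2, 3 died.
old = {
    ("events", 0): [1, 2, 3],
    ("events", 1): [2, 3, 1],
}

reassignment = remap_brokers(old, {1: 4, 2: 5, 3: 6})
election = election_list(old)

print(json.dumps(reassignment, indent=2))
print(json.dumps(election, indent=2))
```

The first document would go to bin/kafka-reassign-partitions.sh and the second to bin/kafka-preferred-replica-election.sh, in that order.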