I'm planning to upgrade a 0.8 cluster from 2 old nodes to 3 new ones
(better hardware). I'm using a replication factor of 2.
I'm thinking the plan should be to spin up the 3 new nodes and operate as
a 5-node cluster for a while. Then remove 1 of the old nodes first and
wait for the partitions on the removed node to get replicated to the other
nodes. Then do the same for the other old node.
Does this sound sensible?
How does the cluster decide when to re-replicate partitions that are on a
node that is no longer available? Does it only happen if/when new messages
arrive for that partition? Is it done on a partition-by-partition basis?
Or is it a cluster-level decision that a broker is no longer valid, in
which case all affected partitions would immediately get replicated to new
brokers as needed?
I'm just wondering how I will know when it will be safe to take down my
second old node, after the first one is removed, etc.
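
To make concrete what I mean by "safe", here is a rough Python sketch of the
check I have in mind. It is only an illustration of the condition, not an API
of the cluster software: the function name, the topic name, the broker ids
(1 and 2 old, 3 to 5 new) and the two maps are all made up, and I would have
to fill them in from whatever the cluster actually reports.

```python
def safe_to_remove(replicas, in_sync, broker, replication_factor=2):
    """Return True if `broker` could be shut down without losing redundancy.

    replicas: hypothetical map of partition -> broker ids assigned as replicas
    in_sync:  hypothetical map of partition -> broker ids currently caught up
    """
    for partition, assigned in replicas.items():
        # Caught-up copies that would still exist without this broker.
        remaining = [b for b in in_sync.get(partition, []) if b != broker]
        # Unsafe if the broker still hosts a replica, or if too few
        # caught-up copies would be left on the other brokers.
        if broker in assigned or len(remaining) < replication_factor:
            return False
    return True


# Made-up state: old broker 1 still holds a replica, old broker 2 holds none.
replicas = {("events", 0): [1, 3], ("events", 1): [4, 5]}
in_sync = {("events", 0): [1, 3], ("events", 1): [4, 5]}

print(safe_to_remove(replicas, in_sync, broker=1))
print(safe_to_remove(replicas, in_sync, broker=2))
```

In other words, I would only want to take an old node down once no partition
lists it as a replica and every partition still has 2 caught-up copies
elsewhere. What I don't know is how and when the cluster gets into that state.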