I think I know the answer to this already, but I wanted to check my assumptions before proceeding.
We are using Kafka as a queueing mechanism for receiving messages from stateless producers. We are operating in a legal framework where we can never lose a committed message, but we can reject a write if Kafka is unavailable, and it will be retried later. All of our servers are in one rack, so we are vulnerable if the whole rack goes down. We will have 3-4 Kafka brokers with a replication factor of 3 (RF=3).
To guarantee (to the greatest extent possible) that we never lose a message we have acknowledged, it seems we need request.required.acks=-1 and log.flush.interval.messages=1, i.e. fsync on every message and wait for all brokers in the ISR to acknowledge the write before returning success. This would guard against the failure scenario where all servers in our rack go down simultaneously.
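To make the question concrete, here is a rough sketch of the configuration I have in mind. It is in Python purely for illustration: the dict names and the render_properties() helper are hypothetical, and only the keys and values come from the description above. As far as I understand, request.required.acks is set on the producer while log.flush.interval.messages is a broker setting, so they end up in two different config files.

```python
# Hypothetical sketch: the dict names and render_properties() are made up
# for illustration; only the keys and values come from the question above.

# Producer side: wait for every replica in the ISR to acknowledge the write.
PRODUCER_SETTINGS = {
    "request.required.acks": "-1",
}

# Broker side (server.properties): fsync the log after every message.
BROKER_SETTINGS = {
    "log.flush.interval.messages": "1",
}

# Topic level: the replication factor from the question (RF=3).
REPLICATION_FACTOR = 3


def render_properties(settings):
    """Render a dict as .properties-style lines, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print("# producer config")
    print(render_properties(PRODUCER_SETTINGS))
    print("# broker config (server.properties)")
    print(render_properties(BROKER_SETTINGS))
    print(f"# topics created with replication factor {REPLICATION_FACTOR}")
```

Running the script just prints the two fragments so they can be compared against the real producer and broker config files.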
Is my understanding correct?