Recently, we had an issue where our Kafka brokers were shut down hard (and
so did not write out the clean shutdown file). As a result, on restart each
broker went through all of its logs and ran a recovery on them.
Unfortunately, this took a long time (on the order of 30 minutes). We have
a lot of topics (~1000 or so). Is there any way this can be done more
quickly, say in parallel?
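
To illustrate what I mean by "in parallel", here is a rough sketch. It is
not Kafka's actual code: recover_log is a hypothetical stand-in for whatever
the broker does per log, and the thread count is an arbitrary assumption.
The only point is that each log directory looks independent, so the
recoveries could be handed to a pool of workers rather than run one by one.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def recover_log(log_dir: Path) -> str:
    # Hypothetical placeholder for the per-log recovery step
    # (e.g. scanning segments and rebuilding indexes).
    return log_dir.name


def recover_all(log_dirs, num_threads=8):
    # Assumption: logs are independent, so they can be recovered concurrently.
    # pool.map returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(recover_log, log_dirs))
```

Whether this would actually help presumably depends on how much of the
30 minutes is bound by disk throughput rather than by the serial loop.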
Also, could it be done as a background process, so that the server can
start up and start receiving messages, with logs for incoming topics
prioritized in the recovery process, and perhaps messages buffered in
memory while the log recovery is happening?
It seems onerous to block all activity for 30 minutes while a slow, serial
recovery job runs.