Kafka, mail # user - slow log recovery


Re: slow log recovery
Jun Rao 2013-05-07, 04:32
In 0.8, if you turn on replication, it may not matter too much if a broker
takes a long time to start up, since data can still be served from the
replicas. It may be possible to speed up recovery by maintaining a flush
checkpoint file on disk. We could then use that info to reduce the amount of
data that has to be recovered.
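
To make the checkpoint idea concrete, here is a rough sketch in Python. It is purely illustrative and not Kafka's actual code: the file name, the JSON format, and the helpers `write_checkpoint`, `read_checkpoint`, and `segments_to_recover` are all assumptions made up for this example. The idea is that the broker periodically records the last flushed offset per log, and on an unclean restart only the segments at or after that offset need to be scanned.

```python
import bisect
import json
import os
import tempfile


def write_checkpoint(path, flushed_offsets):
    """Atomically persist {log_name: last_flushed_offset} (hypothetical format)."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as f:
        json.dump(flushed_offsets, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the checkpoint itself is on disk
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def read_checkpoint(path):
    """Return the saved offsets, or {} if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, ValueError):
        return {}


def segments_to_recover(segment_base_offsets, flushed_offset):
    """Given sorted segment base offsets, return the segments that may hold
    unflushed data: the one containing flushed_offset and all later ones."""
    i = bisect.bisect_right(segment_base_offsets, flushed_offset) - 1
    return segment_base_offsets[max(i, 0):]
```

With no checkpoint entry for a log, a caller could pass `-1` as the flushed offset, which falls back to recovering every segment.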

Thanks,

Jun
On Mon, May 6, 2013 at 3:07 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Recently, we had an issue where our kafka brokers were shut down hard (and
> so did not write out the clean shutdown file).  Thus on restart, it went
> through all logs and ran a recovery on them.
>
> Unfortunately, this took a long time (on the order of 30 minutes).  We have
> a lot of topics (e.g. ~1000 or so).  Is there any way this can be done more
> quickly, say in parallel?
>
> Also, could it be done as a background process, so the server can start up and
> start receiving messages, logs for incoming topics are prioritized in the
> recovery process, and perhaps messages can still be buffered in memory
> while the log recovery is happening?
>
> It seems onerous to block all activity for 30 minutes while a slow, serial,
> recovery job happens....
>
> Thoughts?
>
> Jason
>
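
As an illustration of the parallel, prioritized recovery asked about above, here is a minimal Python sketch. It is not how Kafka implements recovery: `recover_log` is a hypothetical placeholder for the per-log work, and `recover_all`, `hot_logs`, and the default worker count are assumptions for the example. Logs named in `hot_logs` (for instance, topics with incoming traffic) are submitted to the pool first.

```python
from concurrent.futures import ThreadPoolExecutor


def recover_log(log_dir):
    """Hypothetical placeholder: scan and repair a single log directory."""
    return log_dir


def recover_all(log_dirs, hot_logs=(), recover=recover_log, workers=8):
    """Recover logs on a thread pool, submitting 'hot' logs first.

    Returns the results in the order the logs were submitted.
    """
    hot = set(hot_logs)
    # sorted() is stable, so the original order is kept within each group.
    ordered = sorted(log_dirs, key=lambda d: d not in hot)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recover, ordered))
```

Since recovery is mostly disk-bound, how much a pool like this helps would depend on the number of disks and how the logs are spread across them.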