Kafka >> mail # user >> slow log recovery


Jason Rosenberg 2013-05-06, 22:07
Jun Rao 2013-05-07, 04:32
Re: slow log recovery
Will producers also be able to start sending new messages to a replica
while one broker is taking a long time to start up?
On Mon, May 6, 2013 at 9:31 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> In 0.8, if you turn on replication, it may not matter too much if a broker
> takes a long time to start up, since data can still be served from the
> replicas. It may be possible to improve this by maintaining a flush
> checkpoint file on disk. We can then use that info to reduce the amount of
> data to be recovered.
>
> Thanks,
>
> Jun
>
>
> On Mon, May 6, 2013 at 3:07 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
>
> > Recently, we had an issue where our Kafka brokers were shut down hard (and
> > so did not write out the clean shutdown file).  Thus, on restart, each
> > broker went through all of its logs and ran a recovery on them.
> >
> > Unfortunately, this took a long time (on the order of 30 minutes).  We have
> > a lot of topics (~1000 or so).  Is there any way this can be done more
> > quickly, say in parallel?
> >
> > Also, could it be done as a background process, so that the server can
> > start up and start receiving messages, logs for incoming topics are
> > prioritized in the recovery process, and perhaps messages can still be
> > buffered in memory while the log recovery is happening?
> >
> > It seems onerous to block all activity for 30 minutes while a slow, serial
> > recovery job happens....
> >
> > Thoughts?
> >
> > Jason
> >
>
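
To make the flush checkpoint idea quoted above concrete, here is a minimal sketch in Python. It is not Kafka's implementation: the file format (JSON), the function names, and the toy record layout (a payload plus its CRC, with the list index standing in for the offset) are all assumptions made for illustration. The point is only that recovery can skip everything at or below the last offset known to be flushed.

```python
import json
import os
import tempfile
import zlib


def write_checkpoint(path, flushed_offsets):
    """Persist {log_name: last_flushed_offset} (hypothetical JSON format)."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(flushed_offsets, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old file so a crash never leaves a half-written checkpoint.
    os.replace(tmp, path)


def read_checkpoint(path):
    """Return the saved offsets, or {} if there is no checkpoint (full recovery)."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def recover(log_name, records, checkpoint):
    """Validate only the records past the checkpointed offset.

    `records` is a list of (payload_bytes, crc) tuples; the list index is the
    offset. Returns (valid_length, records_checked).
    """
    start = checkpoint.get(log_name, -1) + 1
    checked = 0
    for offset in range(start, len(records)):
        payload, crc = records[offset]
        checked += 1
        if zlib.crc32(payload) != crc:
            return offset, checked  # truncate at the first corrupt record
    return len(records), checked


def demo():
    """Compare recovery work with and without a checkpoint."""
    records = [(b"msg-%d" % i, zlib.crc32(b"msg-%d" % i)) for i in range(10)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "flush-checkpoint.json")
        write_checkpoint(path, {"topic-0": 6})  # offsets 0..6 known flushed
        with_checkpoint = recover("topic-0", records, read_checkpoint(path))
    without_checkpoint = recover("topic-0", records, {})
    return with_checkpoint, without_checkpoint
```

In this toy setup, the checkpointed run only validates the three records after offset 6, while the run without a checkpoint validates all ten.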

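The question of running recovery in parallel rather than serially can be sketched as follows. This is only an illustration under assumptions, not how the broker does it: `recover_log`, `recover_all`, and the in-memory record layout are hypothetical, and the sketch assumes each log can be recovered independently of the others.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def recover_log(records):
    """Return the length of the valid prefix of one log.

    `records` is a list of (payload_bytes, crc) tuples (a stand-in for
    reading and checking messages from a log segment on disk).
    """
    for offset, (payload, crc) in enumerate(records):
        if zlib.crc32(payload) != crc:
            return offset
    return len(records)


def recover_all(logs, num_threads=4):
    """Recover every log in `logs` ({name: records}) on a thread pool.

    Returns {name: valid_length}. Assumes logs are independent, so the
    order in which they are recovered does not matter.
    """
    names = list(logs)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in input order, so zip() pairs them correctly.
        lengths = pool.map(recover_log, (logs[name] for name in names))
        return dict(zip(names, lengths))
```

With ~1000 topics, the expected benefit of something like this depends on whether recovery is bound by disk throughput or by per-log overhead, which the thread does not say, so the thread count here is just a placeholder.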
 
Jun Rao 2013-05-07, 14:55
Jason Rosenberg 2013-05-07, 16:40