Kafka, mail # user - Broker rejoin with big replica lag

Re: Broker rejoin with big replica lag
Jay Kreps 2014-02-05, 23:42
Do we have the right default?

-Jay
On Wed, Feb 5, 2014 at 2:04 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:

>
> > topics are all caught up, but I have one high volume topic (around
> > 40K msgs/sec) that is taking much longer.  I just took a few samples
> > of Replica-MaxLag to see how long it would take to catch up.
> > Currently, it is behind about 12.5 million messages and is catching
> > up at a rate of about 1600 msgs/sec.  At that rate, it'll take
> > around 9 days before the replica is caught up to the leader.
> >
> > Is there any way to speed this up?
>
> During the period your high-volume topic is under-replicated you can
> temporarily try one or both of the following:
> - Increasing num.replica.fetchers (default is one)
> - If you don't have too many topic-partitions you can also increase
>   replica.fetch.max.bytes.
>
> > Or, alternatively, I don't actually care about this topic's
> > history.  It is a new topic, and I know that it doesn't yet have any
> > consumers.  I'd be fine with instructing both brokers to drop
> > old logs and just start from the top of the log.  I could do this by
> > manually deleting the topic (kafka data files and in zookeeper), but
> > to do so properly with 0.8.0 I think I'd have to shut down the
> > whole cluster, correct?  I'd rather not do this, as another
> > topic does have a consumer and I don't want to lose messages for
> > it.
>
> Right - or you could do a rolling bounce and change the retention
> settings (http://kafka.apache.org/documentation.html#brokerconfigs) of
> that topic to something low so it gets expired and then do another
> rolling bounce to remove the override.
>
> --
> Joel
>
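
For anyone sampling Replica-MaxLag by hand as described above, here is a rough sketch of the catch-up estimate. It assumes the lag shrinks at a roughly constant net rate between the first and last sample, which may not hold once fetcher settings change. The function name and the sample values are hypothetical and are not taken from this thread.

```python
def estimate_catch_up_seconds(samples):
    """Estimate seconds until a replica's lag reaches zero.

    samples: list of (timestamp_seconds, lag_in_messages) tuples, oldest first,
             e.g. readings of the Replica-MaxLag metric taken by hand.
    Returns None if the lag is not shrinking (the replica is not catching up).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    (t_first, lag_first), (t_last, lag_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must be ordered by increasing timestamp")

    # Net catch-up rate: how fast the lag is shrinking, in msgs/sec.
    net_rate = (lag_first - lag_last) / elapsed
    if net_rate <= 0:
        return None

    return lag_last / net_rate


# Hypothetical readings taken 60 seconds apart.
samples = [(0, 1_000_000), (60, 940_000)]
eta = estimate_catch_up_seconds(samples)
print(f"estimated catch-up time: {eta / 3600:.2f} hours")
```

Re-running the estimate after raising num.replica.fetchers or replica.fetch.max.bytes gives a quick check on whether the change actually improved the net rate.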