Re: Broker rejoin with big replica lag
Totally worked, thanks all.
On Feb 5, 2014, at 5:18 PM, Andrew Otto <[EMAIL PROTECTED]> wrote:

>> - Increasing num.replica.fetchers (default is one)
> Awesome!  I just tried this one, bumped it up to 8 (12 cores on this broker box).  It is now catching up at around 17K msgs/sec, which means it should finish in about 4 or 5 hours.  I’ll check up on it again tomorrow.
>
> That should do it.  Thanks!
>
>
>
> On Feb 5, 2014, at 5:04 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>
>>
>>> topics are all caught up, but I have one high volume topic (around
>>> 40K msgs/sec) that is taking much longer.  I just took a few samples
>>> of Replica-MaxLag to see how long it would take to catch up.
>>> Currently, it is behind about 12.5 million messages and is catching
>>> up at a rate of about 1600 msgs/sec.  At that rate, it’ll take
>>> around 9 days before the replica is caught up to the leader.
>>>
>>> Is there any way to speed this up?
>>
>> During the period your high-volume topic is under-replicated you can
>> temporarily try one or both of the following:
>> - Increasing num.replica.fetchers (default is one)
>> - If you don't have too many topic-partitions you can also increase
>> replica.fetch.max.bytes.
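
A minimal sketch of what those two overrides might look like in server.properties; the values are illustrative rather than settings from this thread, and replica.fetch.max.bytes should stay at or above message.max.bytes so the largest messages can still replicate:

  # run more replica fetcher threads per source broker (default is 1)
  num.replica.fetchers=8
  # let each fetch request pull more data per partition (default is 1 MB)
  replica.fetch.max.bytes=4194304

Both are broker-level settings, so applying (and later reverting) them means a rolling restart of the brokers, which is why they are suggested only as a temporary measure.
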
>>
>>> Or, alternatively, I don’t actually care about this topic’s
>>> history.  It is a new topic, and I know that it doesn't yet have any
>>> consumers.  I’d be fine with instructing both brokers to drop
>>> old logs and just start from the top of the log.  I could do this by
>>> manually deleting the topic (kafka data files and in zookeeper), but
>>> to do so properly with 0.8.0 I think I’d have to shut down the
>>> whole cluster, correct?  I’d rather not do this, as another
>>> topic does have a consumer and I don’t want to lose messages for
>>> it.
>>
>> Right - or you could do a rolling bounce and change the retention
>> settings (http://kafka.apache.org/documentation.html#brokerconfigs) of
>> that topic to something low so it gets expired and then do another
>> rolling bounce to remove the override.
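
A rough sketch of that approach, assuming the 0.8-era per-topic broker overrides (check the exact property names against the broker config page linked above; "high-volume-topic" is just a placeholder for the real topic name):

  # first rolling bounce: expire this topic's backlog quickly
  log.retention.hours.per.topic=high-volume-topic:1

Once the old segments have been deleted, drop the override and do the second rolling bounce so the topic falls back to the cluster-wide log.retention.hours.
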
>>
>> --
>> Joel
>