You at least have to retain data long enough for the followers to be able
to copy it from the leader. As Jun mentioned, you can observe the max lag
follower and set replica.lag.max.messages to be a little higher than that.
The high watermark is incremented to point to the logical offset of the
latest message that has been copied to all the replicas in the ISR. Once
the high watermark includes a certain message, it is considered
"committed". So it is not configurable.
On Tue, Sep 10, 2013 at 8:26 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> In a stress test, 100K 1Mb messages (100Gb in size) are published (our
> bandwidth is limited).
> As our retention size is 3G which is smaller than the required default
> minimum retention size
> (4G), we noticed 20K messages were missing. After increasing
> "num.replica.fetchers" to 2,
> no more message loss.
> So how much messages in one high watermark? And is it configurable? I went
> the broker configuration page and found nothing.
> -----Original Message-----
> From: Jun Rao [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, September 10, 2013 11:00 AM
> To: [EMAIL PROTECTED]
> Subject: Re: implicit default minimum retention size per partition is 4GB.
> Monitoring the lag in bytes makes sense. The only difficulty is currently,
> the high watermark in the leader is represented in logical message offset,
> not the byte offset. For now, you will have to do the bytes to messages
> translation yourself.
> As for setting replica.lag.max.messages, you can observe the max lag in
> the follower and set replica.lag.max.messages to be a bit larger than that.
> I am curious to know the observed max lag in your use case.
> On Tue, Sep 10, 2013 at 6:46 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > Hi team,
> > For default broker configuration, replica.lag.max.messages is 4000 and
> > message.max.bytes is 1Mb.
> > In the extreme case, the follower(s) could lag by 4000 messages. The
> > leader must save at least
> > 4000 messages to allow follower(s) to catch up. So the minimum
> > retention size is 4000Mb=4Gb.
> > It is better to add this to the documentation.
> > In our case, message.max.bytes is much larger than 1Mb and
> > replica.lag.max.messages is larger than 4000. This implicit minimum
> > retention size is hundreds of Gb and we have hundreds of partitions on
> > each broker. We feel we have to use a disk array to run Kafka.
> > Because the topics have different maximum message size, it makes more
> > sense to use the size gap between the leader and follower(s), e.g.,
> > say the follower(s) can only lag behind the leader by 2Gb.
> > This makes it easier to control the behavior the brokers and save disk
> > space.
> > Regards,
> > Libo