Kafka >> mail # user >> Re: implicit default minimum retention size per partition is 4GB.


Re: implicit default minimum retention size per partition is 4GB.
Monitoring the lag in bytes makes sense. The only difficulty is that,
currently, the high watermark in the leader is represented as a logical
message offset, not a byte offset. For now, you will have to do the
bytes-to-messages translation yourself.
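
One rough way to do that translation is to divide the byte lag by an average message size measured for the topic. The sketch below assumes such an average is available from your own metrics. The function names and the 512 KB figure are hypothetical and are not part of Kafka.

```python
import math


def bytes_to_messages(lag_bytes: int, avg_message_bytes: float) -> int:
    """Estimate how many messages a byte lag corresponds to, rounded up."""
    if avg_message_bytes <= 0:
        raise ValueError("avg_message_bytes must be positive")
    return math.ceil(lag_bytes / avg_message_bytes)


def messages_to_bytes(lag_messages: int, avg_message_bytes: float) -> int:
    """Estimate the byte size of a lag expressed in messages, rounded up."""
    if avg_message_bytes <= 0:
        raise ValueError("avg_message_bytes must be positive")
    return math.ceil(lag_messages * avg_message_bytes)


# Assumed numbers: a 2 GB byte budget and a 512 KB average message size.
print(bytes_to_messages(2_000_000_000, 512_000))
```

The estimate is only as good as the average: topics with highly variable message sizes would need a more conservative figure, such as the configured maximum message size.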

As for setting replica.lag.max.messages, you can observe the max lag in the
follower and set replica.lag.max.messages to be a bit larger than that. I
am curious to know the observed max lag in your use case.
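
As a minimal sketch of that tuning step: take the largest follower lag you have observed and add some headroom. The sample values, the 25% headroom, and the helper name are assumptions for illustration only. How you collect the lag samples depends on your monitoring setup.

```python
import math


def suggest_replica_lag_max_messages(observed_lags, headroom=1.25):
    """Return a setting slightly above the largest observed follower lag.

    observed_lags: follower lag samples, in messages (hypothetical input).
    headroom: multiplier applied to the maximum (assumed value, tune to taste).
    """
    if not observed_lags:
        raise ValueError("need at least one lag sample")
    return math.ceil(max(observed_lags) * headroom)


# Made-up samples of follower lag in messages.
samples = [120, 950, 3100, 2800]
value = suggest_replica_lag_max_messages(samples)

# Emit the line as it would appear in the broker properties file.
print(f"replica.lag.max.messages={value}")
```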

Thanks,

Jun
On Tue, Sep 10, 2013 at 6:46 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:

> Hi team,
>
> With the default broker configuration, replica.lag.max.messages is 4000
> and message.max.bytes is 1 MB.
> In the extreme case, the follower(s) could lag by 4000 messages. The
> leader must retain at least
> 4000 messages to allow the follower(s) to catch up. So the minimum
> retention size per partition is 4000 MB = 4 GB.
> It would be good to add this to the documentation.
>
> In our case, message.max.bytes is much larger than 1 MB and
> replica.lag.max.messages is larger than
> 4000. This implicit minimum retention size is hundreds of GB, and we have
> hundreds of partitions on
> each broker. We feel we have to use a disk array to run Kafka.
>
> Because the topics have different maximum message sizes, it makes more
> sense to use the size gap
> between the leader and follower(s), e.g., say the follower(s) can only lag
> behind the leader by 2 GB.
> This makes it easier to control the behavior of the brokers and saves disk
> space.
>
> Regards,
>
> Libo
>
>
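
The worst-case arithmetic in the quoted message can be written out as a short sketch. It follows the reasoning above (maximum lag in messages multiplied by maximum message size) and uses decimal units, so 1 MB is taken as 1,000,000 bytes. The partition count of 200 is a made-up figure standing in for "hundreds of partitions".

```python
MB = 1_000_000        # decimal megabyte (assumed unit convention)
GB = 1_000_000_000    # decimal gigabyte


def min_retention_bytes(replica_lag_max_messages: int, message_max_bytes: int) -> int:
    """Worst case: every lagging message is as large as the maximum allowed."""
    return replica_lag_max_messages * message_max_bytes


# Values quoted in the message: 4000 messages, 1 MB each.
per_partition = min_retention_bytes(4000, 1 * MB)
print(per_partition / GB)

# Hypothetical partition count per broker, for illustration only.
partitions = 200
print(per_partition * partitions / GB)
```

With the quoted values this gives 4 GB per partition, and the total grows linearly with the number of partitions on the broker.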

 