We have a low-volume topic (~75 msgs/sec) for which we would like to have a
low propagation delay from producer to consumer.
We have 3 brokers, each with a default of 4 partitions, for a total of
The producer is sync, without compression. There are 8 producers each
producing 1/8 of the traffic.
We are using the high-level java consumer, with 4 threads consuming the
We are wrapping the message with a custom Encoder/Decoder and record
currentTimeMillis() on the sender, and do the same in the receiver, then
record the propagation delay. All hosts are time synced with ntp.
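
For reference, the measurement described above can be sketched roughly as follows. This is a minimal, Kafka-free illustration, not our actual Encoder/Decoder: the class name PropagationDelay, the 8-byte timestamp prefix layout, and the nearest-rank percentile method are all assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper; stands in for the custom Encoder/Decoder pair.
final class PropagationDelay {

    // Sender side: prepend the send time (8 bytes, big-endian) to the payload.
    public static byte[] wrap(byte[] payload, long sentAtMillis) {
        return ByteBuffer.allocate(Long.BYTES + payload.length)
                .putLong(sentAtMillis)
                .put(payload)
                .array();
    }

    // Receiver side: read the timestamp back and subtract it from the receive time.
    public static long delayMillis(byte[] wrapped, long receivedAtMillis) {
        long sentAtMillis = ByteBuffer.wrap(wrapped).getLong();
        return receivedAtMillis - sentAtMillis;
    }

    // Nearest-rank percentile over the recorded delays (pct in (0, 100]).
    public static long percentile(long[] delays, double pct) {
        long[] sorted = delays.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] wrapped = wrap(payload, System.currentTimeMillis());
        // In the real setup the receive time is taken on a different host.
        long delay = delayMillis(wrapped, System.currentTimeMillis());
        System.out.println("delay ms: " + delay);
    }
}
```

Because sender and receiver are different hosts, each recorded delay also includes whatever clock offset remains between them after ntp sync, so very small delays should be read with that in mind.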
With the broker settings for flush messages and flush interval left unset
(defaults: 500 msgs and 3000ms), the overall 95th percentile for
propagation delay is 2,500ms.
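
A rough back-of-the-envelope check on which default matters at this volume. The sketch assumes that the message-count threshold is applied per partition, that traffic is spread evenly, and that there are 12 partitions in total (3 brokers x 4); none of these is confirmed here, and the class and method names are made up for the example.

```java
// Hypothetical estimate: how long one partition takes to reach the
// message-count flush threshold at a given topic-wide rate.
final class FlushThresholdEstimate {

    public static double secondsToReachCountThreshold(
            double topicMsgsPerSec, int partitions, int flushMessages) {
        // Assumption: traffic is spread evenly across partitions.
        double perPartitionRate = topicMsgsPerSec / partitions;
        return flushMessages / perPartitionRate;
    }

    public static void main(String[] args) {
        // 75 msgs/sec over an assumed 12 partitions, default threshold of 500 msgs.
        double seconds = secondsToReachCountThreshold(75.0, 12, 500);
        System.out.println("seconds to reach 500 msgs per partition: " + seconds);
    }
}
```

Under those assumptions each partition sees about 6 msgs/sec, so the 500-message threshold would take far longer to reach than the 3000ms interval. If so, the time-based flush is the setting that actually governs the delay at this volume, which would be consistent with a 95th percentile close to 3000ms.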
When we adjust the topic flush interval to 20ms, the 95th percentile drops
When we adjust the consumer's "fetcher.backoff.ms" to 10, the 95th
percentile drops to about 970ms.
We would like this to be sub-500ms.
We could run with fewer partitions and/or more consumer threads.
Is anything glaring about this config? Anything we're missing?