Re: use case with high rate of duplicate messages
Batch processing will increase throughput, but it will also increase latency.
How much latency can your real-time processing tolerate?
One thing you could try is keyed messages, with the key set to the MD5 hash
of your message body. Kafka has a deduplication mechanism on the brokers
that dedups messages with the same key. All you need to do is set the
dedup frequency appropriately for your use case.
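As a rough sketch of the key-derivation part (the class and method names here
are just illustrative, not part of any Kafka API): hash the message body so
that duplicate payloads always map to the same key, then use that string as
the message key when producing. Broker-side dedup can then collapse records
sharing a key.

```java
import java.security.MessageDigest;

public class DedupKey {
    // Derive a stable key from the message body so that exact duplicates
    // always share the same key.
    static String md5Key(String message) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(message.getBytes("UTF-8"));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Identical payloads yield identical keys, so the broker-side
        // dedup keeps only one copy per key.
        System.out.println(md5Key("hello")); // same key both times
        System.out.println(md5Key("hello"));
    }
}
```

You would then pass the derived key as the message key when producing, so
duplicates land in the same partition and are eligible for dedup.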
On Tue, Oct 1, 2013 at 8:19 AM, S Ahmed <[EMAIL PROTECTED]> wrote: