Batch processing will increase the throughput but also increase latency,
how large latency your real-time processing can tolerate?

One thing you could try is to use the keyed messages, with key as the md5
hash of your message. Kafka has a deduplication mechanism on the brokers
that dedup messages with the same key. All you need to do is setting the
dedup frequency appropriately for your use case.

On Tue, Oct 1, 2013 at 8:19 AM, S Ahmed <[EMAIL PROTECTED]> wrote:

NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB