Kafka, mail # user - Consume more than  produce - 2014-08-01, 08:28
Consume more than  produce
After running Kafka as the streaming layer in my production system for a year or so, I decided it was time to audit it and test how many events I lose, if any.
I discovered something interesting which I can't explain.
The producer produces fewer events than the consumer group consumes.
The difference is small: the consumers see about 0.1% more events.
I use the high-level Consumer API (not the SimpleConsumer API).
I thought I might have rebalancing going on in my system, but it doesn't look like it.
Has anyone seen this kind of behaviour?
To audit, I calculated for each event the minute it arrived and attached that value to the event. I then used statsd to count all events, per minute, across my whole producer cluster and across my whole consumer group cluster.
I should add that the mismatch does not happen for every minute.
Thanks, Guy
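
A minimal sketch of the per-minute audit described above, in Python. It is only an illustration, not the poster's actual code: the event layout (a dict with a "ts" field holding a Unix timestamp in seconds) and the function names are assumptions, and an in-memory Counter stands in for the statsd counters so the example stays self-contained.

```python
from collections import Counter
from datetime import datetime, timezone


def minute_bucket(ts_seconds):
    """Truncate a Unix timestamp (seconds) to its minute, formatted in UTC."""
    dt = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return dt.replace(second=0, microsecond=0).strftime("%Y-%m-%dT%H:%M")


def count_per_minute(events):
    """Count events per minute bucket. "ts" is a hypothetical field name."""
    return Counter(minute_bucket(e["ts"]) for e in events)


def audit(produced, consumed):
    """Return {minute: consumed - produced} for every minute where they differ.

    A positive value means the consumer side counted more events than the
    producer side for that minute; a negative value means events are missing.
    """
    p = count_per_minute(produced)
    c = count_per_minute(consumed)
    # Counter returns 0 for missing keys, so minutes seen on one side only work too.
    return {m: c[m] - p[m] for m in sorted(set(p) | set(c)) if c[m] != p[m]}


if __name__ == "__main__":
    produced = [{"ts": 1406881680}, {"ts": 1406881700}, {"ts": 1406881740}]
    # Simulate the consumer side seeing one event twice.
    consumed = produced + [{"ts": 1406881700}]
    print(audit(produced, consumed))
```

Only the minutes where the two sides disagree are reported, which matches the observation that the mismatch does not show up in every minute. Bucketing on a timestamp carried inside the event, rather than on the wall clock at count time, keeps the producer and consumer counts comparable.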
