I'm still a little confused by your description of the problem. It might be
easier to understand if you listed out the exact things you have measured,
what you saw, and what you expected to see.

Since you mentioned the consumer, I can give a little info on how that
works. The consumer consumes from all the partitions it owns
simultaneously. The behavior is that we interleave fetched chunks of
messages from each partition the consumer is processing; the chunk size is
controlled by the fetch size set in the consumer. So the behavior you would
expect is to get a bunch of messages from one partition followed by a
bunch from another partition. The reason for doing this instead of, say,
interleaving individual messages is that it is a big performance
boost--making every message an entry in a blocking queue gives a 5x
performance hit in high-throughput cases. Perhaps this interleaving is the
problem?

On Sun, Aug 25, 2013 at 10:22 AM, Ian Friedman <[EMAIL PROTECTED]> wrote: