(I've changed the subject of this thread (was "Preparing for the 0.8 final

So, I'm not sure that my issue is exactly the same as that mentioned in the

Anyway, in looking at the MaxLag values for several consumers (not all
consuming the same topics), it looks like there was a strange interaction.

First, a new consumer app was started, which initially had a lot of MaxLag
for a period of about a day.  This lag stayed constant at about 175M, then
once this new consumer caught up, its lag dropped quickly toward zero.

What's interesting is that at the time the new consumer seemed to have
caught up, 4 other consumers which had undetectable lag suddenly had
their lag values shoot up, taking between 2 and 4 days to recover.  Also
interesting is that these other consumers were mostly consuming different
topics than the new consumer that triggered this lag swap.

The 1 consumer that took the longest to recover (4 days) was consuming the
same topic as the new one, and the lag value stayed roughly constant at
about 175M messages, until it recovered.

All these consumers are using their own groupids.

One thing to note: I expect the new consumer that started the issues was a
very fast consumer (no downstream IO connections, etc.).

Does any of this make sense?  Could one consumer affect other consumers of
unrelated topics?  Does it make any sense that one consumer would catch up,
only to cause several other consumers to suddenly fall behind?

Thanks :)

On Sun, Oct 20, 2013 at 12:18 PM, Jun Rao <[EMAIL PROTECTED]> wrote: