Re: 0.8 behavior change: consumer "re-receives" last batch of messages in a topic?
In 0.8, the iterator over the data returned in the FetchResponse yields
MessageAndOffset objects. This class has a nextOffset() API, which returns
the offset of the next message in the message set. So, the nextOffset()
value returned on the last message in the message set should be used as the
fetch offset in the following fetch() call to Kafka.
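
To make the offset bookkeeping concrete, here is a minimal sketch of the
loop. It does not use the real Kafka client: the MessageAndOffset class and
the fetch() method below are hypothetical in-memory stand-ins (only the
offset() and nextOffset() accessors are modeled, and nextOffset() is assumed
to be offset + 1), and the names NextOffsetSketch and consumeAll are made up
for illustration. The check that skips messages older than the requested
offset mirrors the one described in the quoted message below; treat it as a
defensive assumption rather than a statement of what the broker returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NextOffsetSketch {

    // Hypothetical stand-in for Kafka's MessageAndOffset; not the real class.
    static final class MessageAndOffset {
        private final long offset;
        private final String payload;

        MessageAndOffset(long offset, String payload) {
            this.offset = offset;
            this.payload = payload;
        }

        long offset() { return offset; }

        // Assumption for this sketch: the next message sits at offset + 1.
        long nextOffset() { return offset + 1; }

        String payload() { return payload; }
    }

    // Hypothetical stand-in for a fetch call: returns up to maxMessages
    // messages from an in-memory log, starting at fetchOffset.
    static List<MessageAndOffset> fetch(List<String> log, long fetchOffset, int maxMessages) {
        List<MessageAndOffset> batch = new ArrayList<>();
        for (long o = fetchOffset; o < log.size() && batch.size() < maxMessages; o++) {
            batch.add(new MessageAndOffset(o, log.get((int) o)));
        }
        return batch;
    }

    // Consumes the log from startOffset, appending payloads to out.
    // Returns the offset to use for the next fetch.
    static long consumeAll(List<String> log, long startOffset, int maxMessages, List<String> out) {
        long fetchOffset = startOffset;
        while (true) {
            List<MessageAndOffset> batch = fetch(log, fetchOffset, maxMessages);
            if (batch.isEmpty()) {
                // Nothing new. A real consumer would wait and poll again;
                // the sketch simply stops.
                break;
            }
            for (MessageAndOffset m : batch) {
                if (m.offset() < fetchOffset) {
                    continue; // defensive: ignore anything older than requested
                }
                out.add(m.payload());
                // Key step: advance to nextOffset(), not offset().
                fetchOffset = m.nextOffset();
            }
        }
        return fetchOffset;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "c", "d", "e");
        List<String> seen = new ArrayList<>();
        long next = consumeAll(log, 0L, 2, seen);
        System.out.println("consumed " + seen.size() + " messages, next fetch offset " + next);
    }
}
```

If the loop stored m.offset() instead of m.nextOffset(), each fetch in this
sketch would start again at the last message already processed, which is the
kind of re-delivery described in the question.
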
On Wed, Mar 13, 2013 at 11:49 AM, Hargett, Phil <
[EMAIL PROTECTED]> wrote:
> I have 2 consumers in our scenario, reading from different brokers. Each
> broker is running standalone, although each has its own dedicated
> zookeeper instance for bookkeeping.
> After switching from 0.7.2, I noticed that both consumers exhibited high
> CPU usage. I am not yet exploiting any zookeeper knowledge in my consumer
> code; I am just making calls to the SimpleConsumer in the java API, passing
> the host and port of my broker.
> In 0.7.2, I kept the last offset from messages received via a fetch, and
> used that as the offset passed into the fetch method when receiving the
> next message set.
> With 0.8, I had to add a check to drop fetched messages when the message's
> offset was less than my own offset, based on the last message I saw. If I
> didn't make that change, it seemed like the last 200 or so messages in my
> topic (probably matches a magic batch size configured somewhere in all of
> this code) were continually refetched.
> In this scenario, my topic was no longer accumulating messages, as I had
> turned off the producer, so I was expecting the fetches to eventually
> either block, return an empty message set, or fail (not sure of semantics
> of fetch). Continually receiving the last "batch" of messages at the end of
> the topic was not a semantic I expected.
> Is this an intended change in behavior—or do I need to write better
> consumer code?
> Guidance, please.