Kafka >> mail # user >> 0.8 behavior change: consumer "re-receives" last batch of messages in a topic?


Hargett, Phil 2013-03-13, 19:00
Chris Curtin 2013-03-13, 19:14
Neha Narkhede 2013-03-13, 19:15
Re: 0.8 behavior change: consumer "re-receives" last batch of messages in a topic?
Thanks Neha,

I somehow missed 'nextOffset' when converting my logic. I'm assuming
the +1 trick works by chance and I shouldn't assume the next offset is +1?

(It is minor for me to fix, I'm just curious where +1 might not work.)

Thanks,

Chris
On Wed, Mar 13, 2013 at 3:14 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> In 0.8, the iterator over the data returned in the FetchResponse is over
> MessageAndOffset. This class has a nextOffset() API, which is the offset of
> the next message in the message set. So, the nextOffset() value returned on
> the last message in the message set should be used as the fetch offset in
> the following fetch() call to Kafka.
>
> Thanks,
> Neha
>
>
> On Wed, Mar 13, 2013 at 11:49 AM, Hargett, Phil <[EMAIL PROTECTED]> wrote:
>
> > I have 2 consumers in our scenario, reading from different brokers. Each
> > broker is running standalone, although each has its own dedicated
> > ZooKeeper instance for bookkeeping.
> >
> > After switching from 0.7.2, I noticed that both consumers exhibited high
> > CPU usage. I am not yet exploiting any ZooKeeper knowledge in my consumer
> > code; I am just making calls to the SimpleConsumer in the Java API, passing
> > the host and port of my broker.
> >
> > In 0.7.2, I kept the last offset from messages received via a fetch, and
> > used that as the offset passed into the fetch method when receiving the
> > next message set.
> >
> > With 0.8, I had to add a check to drop fetched messages when the message's
> > offset was less than my own offset, based on the last message I saw. If I
> > didn't make that change, it seemed like the last 200 or so messages in my
> > topic (probably matches a magic batch size configured somewhere in all of
> > this code) were continually refetched.
> >
> > In this scenario, my topic was no longer accumulating messages, as I had
> > turned off the producer, so I was expecting the fetches to eventually
> > either block, return an empty message set, or fail (not sure of semantics
> > of fetch). Continually receiving the last "batch" of messages at the end of
> > the topic was not a semantic I expected.
> >
> > Is this an intended change in behavior—or do I need to write better
> > consumer code?
> >
> > Guidance, please.
> >
> > :)
>
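
The advice quoted above (feed the nextOffset() of the last message into the following fetch) can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: FakeBroker, poll, and this MessageAndOffset are hypothetical stand-ins written for this example, not the Kafka client classes, and the simulated broker is assumed to number messages sequentially and to return an empty set once the fetch offset reaches the end of the log.

```java
import java.util.ArrayList;
import java.util.List;

public class NextOffsetDemo {

    /** Hypothetical stand-in for a fetched message; NOT the real Kafka class. */
    static final class MessageAndOffset {
        final String payload;
        final long offset;      // offset of this message
        final long nextOffset;  // offset the broker says to fetch next

        MessageAndOffset(String payload, long offset, long nextOffset) {
            this.payload = payload;
            this.offset = offset;
            this.nextOffset = nextOffset;
        }
    }

    /** Hypothetical in-memory log standing in for one partition on a broker. */
    static final class FakeBroker {
        private final List<String> log = new ArrayList<>();

        void append(String payload) {
            log.add(payload);
        }

        /** Returns up to maxMessages starting at fetchOffset (assumed >= 0); empty at end of log. */
        List<MessageAndOffset> fetch(long fetchOffset, int maxMessages) {
            List<MessageAndOffset> out = new ArrayList<>();
            for (long i = fetchOffset; i < log.size() && out.size() < maxMessages; i++) {
                // The simulated broker assigns nextOffset; the consumer never computes it.
                out.add(new MessageAndOffset(log.get((int) i), i, i + 1));
            }
            return out;
        }
    }

    /**
     * Polls the broker a fixed number of times, starting at offset 0.
     * useNextOffset=true  : next fetch starts at the last message's nextOffset.
     * useNextOffset=false : next fetch starts at the last message's own offset.
     */
    static List<String> poll(FakeBroker broker, int polls, int batchSize, boolean useNextOffset) {
        List<String> received = new ArrayList<>();
        long fetchOffset = 0;
        for (int p = 0; p < polls; p++) {
            for (MessageAndOffset m : broker.fetch(fetchOffset, batchSize)) {
                received.add(m.payload);
                fetchOffset = useNextOffset ? m.nextOffset : m.offset;
            }
        }
        return received;
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker();
        broker.append("a");
        broker.append("b");
        broker.append("c");

        System.out.println(poll(broker, 4, 2, true));   // [a, b, c]
        System.out.println(poll(broker, 4, 2, false));  // [a, b, b, c, c, c]
    }
}
```

In this simulation, reusing the last message's own offset makes every later fetch start on a message that was already delivered, so the tail of the log keeps coming back even though nothing new is produced. Using the nextOffset value moves the fetch position past the last message, and subsequent fetches come back empty. Whether that is the whole explanation for the roughly 200 re-received messages described in the original report is not established by this thread.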

 
Neha Narkhede 2013-03-13, 19:58