Kafka user mailing list: 0.8 behavior change: consumer "re-receives" last batch of messages in a topic?


Hargett, Phil 2013-03-13, 19:00
Chris Curtin 2013-03-13, 19:14
Neha Narkhede 2013-03-13, 19:15
Chris Curtin 2013-03-13, 19:42

Re: 0.8 behavior change: consumer "re-receives" last batch of messages in a topic?
+1 works every time. We provided a nextOffset() API for convenience just in
case this changes in the future.

Thanks,
Neha
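
A minimal sketch of the fetch loop this reply describes, assuming an in-memory stand-in for the broker. FakeMessageAndOffset and FakeLog are hypothetical classes written for illustration only; they are not Kafka's SimpleConsumer API, and only the name nextOffset() comes from the thread. The stand-in also assumes that a fetch returns messages starting at the requested offset, inclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the MessageAndOffset class mentioned in the thread.
final class FakeMessageAndOffset {
    final long offset;
    final String payload;

    FakeMessageAndOffset(long offset, String payload) {
        this.offset = offset;
        this.payload = payload;
    }

    // Offset to use for the next fetch. It is offset + 1 here, but callers
    // should not rely on that and should call this method instead.
    long nextOffset() {
        return offset + 1;
    }
}

// Hypothetical in-memory log. Assumption: fetch is inclusive of fetchOffset.
final class FakeLog {
    private final List<String> messages = new ArrayList<>();

    void append(String payload) {
        messages.add(payload);
    }

    List<FakeMessageAndOffset> fetch(long fetchOffset, int maxMessages) {
        List<FakeMessageAndOffset> out = new ArrayList<>();
        for (long o = Math.max(0L, fetchOffset);
             o < messages.size() && out.size() < maxMessages; o++) {
            out.add(new FakeMessageAndOffset(o, messages.get((int) o)));
        }
        return out;
    }
}

final class NextOffsetSketch {
    // Reads the whole log in batches, carrying nextOffset() of the last
    // message seen into the following fetch.
    static List<String> drain(FakeLog log, int batchSize) {
        List<String> seen = new ArrayList<>();
        long fetchOffset = 0L;
        while (true) {
            List<FakeMessageAndOffset> batch = log.fetch(fetchOffset, batchSize);
            if (batch.isEmpty()) {
                break; // caught up with the end of the log
            }
            for (FakeMessageAndOffset m : batch) {
                seen.add(m.payload);
                fetchOffset = m.nextOffset();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        FakeLog log = new FakeLog();
        for (int i = 0; i < 5; i++) {
            log.append("m" + i);
        }
        System.out.println(drain(log, 2));
    }
}
```

If the loop instead passed the last message's own offset to the next fetch, this stand-in would return that message again on every call, which resembles the re-receive symptom reported at the start of the thread.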
On Wed, Mar 13, 2013 at 12:42 PM, Chris Curtin <[EMAIL PROTECTED]> wrote:

> Thanks Neha,
>
> I somehow missed the 'nextOffset' when converting my logic. I'm assuming
> the +1 trick works by chance and I shouldn't assume the next offset is +1?
>
> (It is minor for me to fix, I'm just curious where +1 might not work.)
>
> Thanks,
>
> Chris
>
>
> On Wed, Mar 13, 2013 at 3:14 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > In 0.8, the iterator over the data returned in the FetchResponse is over
> > MessageAndOffset. This class has a nextOffset() API, which is the offset of
> > the next message in the message set. So, the nextOffset() value returned on
> > the last message in the message set should be used as the fetch offset in
> > the following fetch() call to Kafka.
> >
> > Thanks,
> > Neha
> >
> >
> > On Wed, Mar 13, 2013 at 11:49 AM, Hargett, Phil <[EMAIL PROTECTED]> wrote:
> >
> > > I have 2 consumers in our scenario, reading from different brokers. Each
> > > broker is running standalone, although each has its own dedicated
> > > ZooKeeper instance for bookkeeping.
> > >
> > > After switching from 0.7.2, I noticed that both consumers exhibited high
> > > CPU usage. I am not yet exploiting any ZooKeeper knowledge in my consumer
> > > code; I am just making calls to the SimpleConsumer in the Java API,
> > > passing the host and port of my broker.
> > >
> > > In 0.7.2, I kept the last offset from messages received via a fetch, and
> > > used that as the offset passed into the fetch method when receiving the
> > > next message set.
> > >
> > > With 0.8, I had to add a check to drop fetched messages when the
> > > message's offset was less than my own offset, based on the last message I
> > > saw. If I didn't make that change, it seemed like the last 200 or so
> > > messages in my topic (probably matches a magic batch size configured
> > > somewhere in all of this code) were continually refetched.
> > >
> > > In this scenario, my topic was no longer accumulating messages, as I had
> > > turned off the producer, so I was expecting the fetches to eventually
> > > either block, return an empty message set, or fail (not sure of the
> > > semantics of fetch). Continually receiving the last "batch" of messages
> > > at the end of the topic was not a semantic I expected.
> > >
> > > Is this an intended change in behavior, or do I need to write better
> > > consumer code?
> > >
> > > Guidance, please.
> > >
> > > :)
> >
>
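
The check described in the original message (dropping fetched messages whose offset is lower than the one requested) can be sketched as follows. This is an illustration under stated assumptions, not the poster's code: OffsetRecord and BlockLog are hypothetical stand-ins, and BlockLog assumes a broker that hands back a whole stored block even when the requested offset falls in the middle of it, which is one plausible reason a consumer might see offsets below the one it asked for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a fetched message with its offset.
final class OffsetRecord {
    final long offset;
    final String payload;

    OffsetRecord(long offset, String payload) {
        this.offset = offset;
        this.payload = payload;
    }

    long nextOffset() {
        return offset + 1;
    }
}

// Hypothetical log that stores messages in fixed-size blocks and always
// returns the entire block containing the requested offset (an assumption).
final class BlockLog {
    private final List<String> messages = new ArrayList<>();
    private final int blockSize;

    BlockLog(int blockSize) {
        this.blockSize = blockSize;
    }

    void append(String payload) {
        messages.add(payload);
    }

    List<OffsetRecord> fetch(long fetchOffset) {
        List<OffsetRecord> out = new ArrayList<>();
        if (fetchOffset < 0 || fetchOffset >= messages.size()) {
            return out; // nothing at or beyond this offset
        }
        long start = (fetchOffset / blockSize) * blockSize;
        long end = Math.min(start + blockSize, (long) messages.size());
        for (long o = start; o < end; o++) {
            out.add(new OffsetRecord(o, messages.get((int) o)));
        }
        return out;
    }
}

final class SkipStaleSketch {
    // Consumes from startOffset to the end of the log, ignoring any record
    // that sits before the offset the consumer is currently waiting for.
    static List<String> consumeFrom(BlockLog log, long startOffset) {
        List<String> seen = new ArrayList<>();
        long fetchOffset = startOffset;
        while (true) {
            List<OffsetRecord> batch = log.fetch(fetchOffset);
            if (batch.isEmpty()) {
                break; // no more data
            }
            for (OffsetRecord r : batch) {
                if (r.offset < fetchOffset) {
                    continue; // earlier than requested, already handled
                }
                seen.add(r.payload);
                fetchOffset = r.nextOffset();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        BlockLog log = new BlockLog(3);
        for (int i = 0; i < 7; i++) {
            log.append("m" + i);
        }
        System.out.println(consumeFrom(log, 4L));
    }
}
```

The skip check and the nextOffset() bookkeeping address different things: the first guards against records returned from before the requested position, while the second ensures the next request starts after the last record processed.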

 