Kafka, mail # user - Re: How do you keep track of offset in a partition


Re: How do you keep track of offset in a partition
S Ahmed 2013-01-29, 03:03
Once you have an offset, is it possible to know how many messages there are
from that point to the end (or at least for the particular topic partition
that you are requesting data from)?

The idea is to see how far behind the consumers are relative to the number of
messages coming in.

I'm guessing the brokers don't really know how many messages they are
currently storing? Or is that what the index is for?
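
For illustration, here is a minimal sketch of the lag arithmetic I have in mind. It assumes the latest offset for the partition has already been obtained from the broker (for example via the getOffsetsBefore() call mentioned in the quoted reply) and makes no Kafka client calls itself; the function name `consumer_lag` is hypothetical. As far as I understand, offsets in the 0.7 line are byte positions in the log, so the result would be bytes behind rather than a message count.

```python
def consumer_lag(latest_offset: int, consumer_offset: int) -> int:
    """Return how far a consumer is behind the end of a partition.

    Assumption: both values are offsets for the same topic partition.
    If offsets are byte positions, the result is in bytes, not messages.
    """
    if consumer_offset > latest_offset:
        # A consumer cannot legitimately be ahead of the end of the log.
        raise ValueError("consumer offset is past the latest offset")
    return latest_offset - consumer_offset


if __name__ == "__main__":
    # Example values only; real offsets would come from the broker.
    print(consumer_lag(latest_offset=1500, consumer_offset=1200))
```

Tracking this number over time, rather than reading it once, would show whether the consumers are catching up or falling further behind.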
On Mon, Jan 28, 2013 at 8:27 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> Jamie,
>
> You need to use the getOffsetsBefore() API to get the earliest/latest
> offset available on the broker for a particular partition.
>
> Thanks,
> Neha
>
>
> On Mon, Jan 28, 2013 at 5:05 PM, Jamie Wang <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > We are using version 0.7.2 of Kafka on Windows. I am wondering what is the
> > right way to fetch data and keep track of the offset in a partition. For
> > example, I am currently assuming the first message the producer sent to
> > the broker is at offset 0. So far it seems to work. Is this a correct
> > assumption?
> >
> > Let's say that 2 days later, the first 100 messages on the broker are
> > discarded because they passed the retention.hours set in the config file.
> > Now what is the offset I should use to retrieve the first message in the
> > partition? And let's also say the offset I had for the 80th message is now
> > not valid. What is the right way to get the correct offset to fetch in the
> > consumer?
> >
> > What is the purpose of the API for getting a list of valid offsets for
> > all segments in a partition?
> >
> > Thanks in advance for your help.
> >
> > Jamie
> >
>
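
To make the retention scenario in the quoted question concrete, here is a minimal sketch of how a consumer might choose where to resume. It assumes the earliest and latest valid offsets have already been fetched from the broker with getOffsetsBefore(), as suggested in the quoted reply; the function name `resolve_fetch_offset` and the policy of falling back to the earliest offset are illustrative choices, not something the thread prescribes. It makes no Kafka client calls.

```python
from typing import Optional


def resolve_fetch_offset(saved: Optional[int], earliest: int, latest: int) -> int:
    """Pick the offset to fetch from, given the broker's valid range.

    Assumptions: `earliest` and `latest` were obtained from the broker,
    and `saved` is the offset the consumer last recorded (None if it
    has never consumed from this partition).
    """
    if saved is None:
        # First run: start from the oldest data still on the broker.
        # This is not necessarily 0 once retention has deleted old segments.
        return earliest
    if saved < earliest or saved > latest:
        # The saved offset no longer points at data the broker holds,
        # e.g. its segment was removed by retention. Fall back to earliest.
        return earliest
    return saved


if __name__ == "__main__":
    # Example values only: offset 80 was valid once, but retention has
    # since moved the earliest available offset to 5000.
    print(resolve_fetch_offset(saved=80, earliest=5000, latest=9000))
```

A consumer that only cares about new data could fall back to `latest` instead; which is right depends on whether skipping the discarded range is acceptable.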