-
Re: How do you keep track of offset in a partition
Neha Narkhede 2013-01-29, 01:27
Jamie,
You need to use the getOffsetsBefore() API to get the earliest/latest offset available on the broker for a particular partition.
Thanks,
Neha

On Mon, Jan 28, 2013 at 5:05 PM, Jamie Wang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We are using version 0.7.2 of Kafka on Windows. I am wondering what is the right way to fetch data and keep track of the offset in a partition. For example, I am currently assuming the first message the producer sent to the broker is at offset 0. So far it seems to be working. Is this a correct assumption?
>
> Let's say 2 days later, the first 100 messages on the broker are discarded because they passed the retention.hours set in the config file. Now what is the offset I should use to retrieve the first message in the partition? And let's also say the offset I had for the 80th message is now not valid. What is the right way to get the correct offset to fetch in the consumer?
>
> What is the purpose of the API for getting a list of valid offsets for all segments in a partition?
>
> Thanks in advance for your help.
>
> Jamie
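
Editor's sketch of the recovery step discussed above, under stated assumptions: the earliest and latest offsets are taken as already fetched from the broker (for example through getOffsetsBefore()), and the class and method names below are hypothetical, not part of Kafka. Resetting to the earliest offset when the stored one has expired is one possible policy; a consumer could equally choose to jump to the latest.

```java
// Hypothetical helper, not a Kafka class. It only decides which offset to
// fetch from next, given bounds that were obtained from the broker elsewhere.
final class OffsetRecovery {

    /**
     * Returns storedOffset if it still lies within [earliest, latest];
     * otherwise falls back to earliest (assumed policy: replay from the
     * oldest data that the broker still retains).
     */
    static long resolveFetchOffset(long storedOffset, long earliest, long latest) {
        if (earliest > latest) {
            throw new IllegalArgumentException("earliest must not exceed latest");
        }
        if (storedOffset < earliest || storedOffset > latest) {
            return earliest;
        }
        return storedOffset;
    }

    public static void main(String[] args) {
        // Stored offset 0 was deleted by retention; the log now starts at 5000.
        System.out.println(resolveFetchOffset(0L, 5000L, 9000L));
        // Stored offset 7000 is still inside the retained range.
        System.out.println(resolveFetchOffset(7000L, 5000L, 9000L));
    }
}
```

An in-range check like this does not prove that the stored value sits on a message boundary, so it is safest to store only offsets that an earlier fetch actually returned.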
-
Re: How do you keep track of offset in a partition
S Ahmed 2013-01-29, 03:03
Once you have an offset, is it possible to know how many messages there are from that point to the end (or at least for the particular topic partition that you are requesting data from)?
The goal is to get a sense of how far behind the consumers are relative to the number of messages coming in.
I'm guessing the brokers don't really know how many messages they are currently storing? Or is that what the index is for?
-
Re: How do you keep track of offset in a partition
Tom Brown 2013-01-29, 03:09
Since offsets in Kafka 0.7.x are just byte counts, you cannot know the exact number of messages remaining to be processed. However, you can know the number of bytes remaining (subtract your consumer's offset from each partition's end offset). Knowing the average message size, you can use that to make a rough guess as to how many messages are remaining.
--Tom
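
Editor's sketch of the estimate described above, assuming the consumer's current offset and the partition's end offset are both byte positions already known to the caller. The class name, method names, and sample numbers are illustrative only; the average message size is something the application would have to measure itself, ideally as the size each message occupies in the log.

```java
// Hypothetical helper for estimating consumer lag from byte offsets.
final class ConsumerLag {

    /** Bytes between the consumer's position and the end of the partition. */
    static long bytesBehind(long consumerOffset, long endOffset) {
        // Clamp at zero in case the consumer offset was read after the end offset.
        return Math.max(0L, endOffset - consumerOffset);
    }

    /** Rough message count: remaining bytes divided by the average message size. */
    static long estimateMessagesBehind(long consumerOffset, long endOffset,
                                       double avgMessageBytes) {
        if (avgMessageBytes <= 0) {
            throw new IllegalArgumentException("avgMessageBytes must be positive");
        }
        return Math.round(bytesBehind(consumerOffset, endOffset) / avgMessageBytes);
    }

    public static void main(String[] args) {
        long consumerOffset = 1000L; // illustrative values, not real broker data
        long endOffset = 5000L;
        System.out.println(bytesBehind(consumerOffset, endOffset));
        System.out.println(estimateMessagesBehind(consumerOffset, endOffset, 200.0));
    }
}
```

For a topic with several partitions, the per-partition byte lags can be summed before dividing by the average size.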