Kafka >> mail # user >> dumb question about offsets

Paul Garner 2012-11-22, 15:25
Neha Narkhede 2012-11-22, 15:33
Paul Garner 2012-11-22, 15:37
Neha Narkhede 2012-11-22, 18:30
Re: dumb question about offsets
On Thu, Nov 22, 2012 at 07:33:31AM -0800, Neha Narkhede wrote:
> Yes, in Kafka 0.7, the offset is the byte position of the message in the
> log for the topic partition. In Kafka 0.8, each message is assigned a
> monotonically increasing, contiguous sequence number per partition,
> starting with 1. So each message is addressable using this sequence number
> instead of the byte position.
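
To make the distinction in the quoted text concrete, here is a minimal sketch of the two addressing schemes on a toy in-memory log. It is not Kafka's implementation: the class `ToyPartitionLog` and its methods are hypothetical names, the byte offset counts only payload bytes (real log entries also carry framing overhead), and the starting sequence number of 1 is taken from the quoted message and should be treated as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyPartitionLog:
    """Toy model of one topic partition's log (hypothetical, for illustration only)."""
    messages: List[bytes] = field(default_factory=list)
    # Byte position of the oldest retained message (0.7-style addressing).
    base_byte_offset: int = 0
    # Sequence number of the oldest retained message (0.8-style addressing).
    # The starting value 1 follows the quoted message; it is an assumption here.
    base_sequence: int = 1

    def append(self, payload: bytes) -> Tuple[int, int]:
        """Append a message and return (byte_offset, sequence_number)."""
        byte_offset = self.base_byte_offset + sum(len(m) for m in self.messages)
        sequence = self.base_sequence + len(self.messages)
        self.messages.append(payload)
        return byte_offset, sequence

    def delete_oldest(self, count: int) -> None:
        """Simulate retention: drop the oldest messages without resetting either counter."""
        dropped = self.messages[:count]
        self.base_byte_offset += sum(len(m) for m in dropped)
        self.base_sequence += len(dropped)
        self.messages = self.messages[count:]


log = ToyPartitionLog()
first = log.append(b"hello")    # 5-byte payload
second = log.append(b"kafka!")  # 6-byte payload
log.delete_oldest(1)            # old data removed, counters keep going
third = log.append(b"x")

print(first, second, third)
```

In this sketch the byte offset advances by the size of each message, while the sequence number advances by exactly one per message. Neither value returns to its starting point after the oldest message is deleted.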

This is interesting. We at Loggly liked the byte offset and thought it was an elegant idea (as explained on the Kafka design page). Are you *replacing* the offset, or will the sequence number be an additional way to reference a message?

And why the change? Perhaps there is a JIRA ticket that explains it in more detail.


> Also, the offset keeps increasing over the lifetime of a cluster, even if
> Kafka deletes older log segments.
> Thanks,
> Neha
> On Thursday, November 22, 2012, Paul Garner wrote:
> > from what I read, the message offset is the byte position of the message in
> > the log file that Kafka writes to
> >
> > the logs are rotated and eventually deleted by Kafka
> >
> > ...does this mean the message offset periodically goes back to start at
> > zero again? or the offset keeps increasing for the life of the cluster as
> > if it was a single big file back to the beginning of time?
> >

Philip O'Toole
Senior Developer
Loggly, Inc.
San Francisco, CA
Jay Kreps 2012-11-22, 21:54