Kafka >> mail # dev >> Re: Offset commit api

Jun Rao 2012-12-18, 16:06
Jay Kreps 2012-12-18, 16:23
Jay Kreps 2012-12-20, 21:04
Re: Offset commit api
No particular objection, though in order to support atomic writes of
(offset, metadata), we will need to define a protocol for the ZooKeeper
payloads. Something like:

   OffsetPayload => Offset [Metadata]
   Metadata => length prefixed string

should suffice. Otherwise we would have to rely on the multi-write
mechanism to keep parallel znodes in sync (I generally don't like things
like this).

+1 for limiting the size (1 KB sounds reasonable)
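
To make the payload idea concrete, here is a rough sketch of how a single znode value could carry both fields so they are written in one atomic update. The byte layout is an assumption for illustration only: an 8-byte big-endian offset, optionally followed by a 2-byte length prefix and UTF-8 metadata. The function names and the `MAX_METADATA_BYTES` constant are hypothetical, and nothing in this thread fixes the field widths.

```python
import struct

# Assumed cap, following the "1 KB" suggestion in this thread.
MAX_METADATA_BYTES = 1024
OFFSET_SIZE = 8   # assumed: signed 64-bit big-endian offset
LENGTH_SIZE = 2   # assumed: signed 16-bit big-endian length prefix


def encode_offset_payload(offset, metadata=None):
    """Build OffsetPayload => Offset [Metadata] as a single byte string."""
    payload = struct.pack(">q", offset)
    if metadata is not None:
        raw = metadata.encode("utf-8")
        if len(raw) > MAX_METADATA_BYTES:
            raise ValueError("metadata exceeds %d bytes" % MAX_METADATA_BYTES)
        # Metadata => length prefixed string
        payload += struct.pack(">h", len(raw)) + raw
    return payload


def decode_offset_payload(payload):
    """Return (offset, metadata); metadata is None when it was omitted."""
    (offset,) = struct.unpack_from(">q", payload, 0)
    if len(payload) == OFFSET_SIZE:
        return offset, None
    (length,) = struct.unpack_from(">h", payload, OFFSET_SIZE)
    start = OFFSET_SIZE + LENGTH_SIZE
    raw = payload[start:start + length]
    if len(raw) != length:
        raise ValueError("truncated metadata")
    return offset, raw.decode("utf-8")
```

Because the offset and metadata travel in one payload, a reader never sees an offset paired with stale metadata, which is the situation the multi-write approach would have to guard against.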

On 12/20/12 4:03 PM, Jay Kreps wrote:
> Okay I did some assessment of use cases we have which aren't using the
> default offset storage API and came up with one generalization. I would
> like to propose that we add a generic metadata field to the offset api on a
> per-partition basis. So that would leave us with the following:
> OffsetCommitRequest => ConsumerGroup [TopicName [Partition Offset Metadata]]
> OffsetFetchResponse => [TopicName [Partition Offset Metadata ErrorCode]]
>    Metadata => string
> If you want to store a reference to any associated state (say an HDFS file
> name) so that if the consumption fails over the new consumer can start up
> with the same state, this would be a place to store that. It would not be
> intended to support large stuff (we could enforce a 1k limit or something),
> just something small or a reference on where to find the state (say a file
> name).
> Objections?
> -Jay
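
For readers following the proposal quoted above, the nesting of `OffsetCommitRequest` and `OffsetFetchResponse` might be modelled roughly as below. This is only an in-memory sketch: the class and field names are hypothetical, the field types are guesses, and it says nothing about the wire encoding or about the code attached to KAFKA-657.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PartitionCommit:
    # [Partition Offset Metadata]
    partition: int
    offset: int
    metadata: str = ""  # small value or a reference, e.g. a file name


@dataclass
class PartitionFetchResult:
    # [Partition Offset Metadata ErrorCode]
    partition: int
    offset: int
    metadata: str = ""
    error_code: int = 0  # assumed: 0 means no error


@dataclass
class OffsetCommitRequest:
    # ConsumerGroup [TopicName [Partition Offset Metadata]]
    consumer_group: str
    topics: Dict[str, List[PartitionCommit]] = field(default_factory=dict)

    def add(self, topic, partition, offset, metadata=""):
        """Append one per-partition entry under the given topic."""
        entry = PartitionCommit(partition, offset, metadata)
        self.topics.setdefault(topic, []).append(entry)


@dataclass
class OffsetFetchResponse:
    # [TopicName [Partition Offset Metadata ErrorCode]]
    topics: Dict[str, List[PartitionFetchResult]] = field(default_factory=dict)


# Example: commit two partitions of one topic, one with a state reference.
request = OffsetCommitRequest(consumer_group="example-group")
request.add("example-topic", 0, 1500, metadata="/example/state/part-0")
request.add("example-topic", 1, 980)
```

The metadata is attached per partition, so a consumer that takes over a partition after failover can read back the offset and its state reference together.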
> On Mon, Dec 17, 2012 at 10:45 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>> Hey Guys,
>> David has made a bunch of progress on the offset commit api implementation.
>> Since this is a public API it would be good to do as much thinking
>> up-front as possible to minimize future iterations.
>> It would be great if folks could do the following:
>> 1. Read the wiki here:
>> https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management
>> 2. Check out the code David wrote here:
>> https://issues.apache.org/jira/browse/KAFKA-657
>> In particular our hope is that this API can act as the first step in
>> scaling the way we store offsets (ZK is not really very appropriate for
>> this). This of course requires having some plan in mind for offset storage.
>> I have written (and then after getting some initial feedback, rewritten) a
>> section in the above wiki on how this might work.
>> If no one says anything I will be taking a slightly modified patch that
>> adds this functionality on trunk as soon as David gets in a few minor
>> tweaks.
>> -Jay
Jay Kreps 2012-12-20, 22:04
Jay Kreps 2012-12-20, 22:05
David Arthur 2012-12-20, 22:09
Milind Parikh 2012-12-20, 22:15
Jay Kreps 2012-12-20, 22:18