Kafka >> mail # dev >> Re: Offset commit api


Re: Offset commit api
No particular objection, though in order to support atomic writes of
(offset, metadata), we will need to define a protocol for the ZooKeeper
payloads. Something like:

   OffsetPayload => Offset [Metadata]
   Metadata => length prefixed string

should suffice. Otherwise we would have to rely on the multi-write
mechanism to keep parallel znodes in sync (I generally don't like things
like this).

+1 for limiting the size (1 KB sounds reasonable).
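
To make the payload idea concrete, here is a rough sketch of how a single znode value could carry both fields, so that one write updates the offset and its metadata together. This is only an illustration under my own assumptions, not a proposed wire format: the 8-byte big-endian offset, the 2-byte length prefix, UTF-8 encoding, treating an empty string as "no metadata", and the names `encode_offset_payload`, `decode_offset_payload`, and `MAX_METADATA_BYTES` are all hypothetical.

```python
import struct
from typing import Tuple

# Assumed cap, following the "1kb" suggestion in this thread.
MAX_METADATA_BYTES = 1024

# Assumed layout: 8-byte signed offset, then 2-byte signed metadata length,
# both big-endian, followed by the metadata bytes.
_HEADER = ">qh"
_HEADER_SIZE = struct.calcsize(_HEADER)  # 10 bytes


def encode_offset_payload(offset: int, metadata: str = "") -> bytes:
    """Pack (offset, metadata) into one byte string for a single znode write."""
    meta = metadata.encode("utf-8")
    if len(meta) > MAX_METADATA_BYTES:
        raise ValueError("metadata exceeds %d bytes" % MAX_METADATA_BYTES)
    return struct.pack(_HEADER, offset, len(meta)) + meta


def decode_offset_payload(payload: bytes) -> Tuple[int, str]:
    """Unpack a payload produced by encode_offset_payload."""
    if len(payload) < _HEADER_SIZE:
        raise ValueError("payload shorter than header")
    offset, length = struct.unpack_from(_HEADER, payload, 0)
    meta = payload[_HEADER_SIZE:_HEADER_SIZE + length]
    if length < 0 or len(meta) != length:
        raise ValueError("metadata length does not match payload size")
    return offset, meta.decode("utf-8")
```

Because both values live in one payload, a reader never observes a new offset paired with stale metadata, which is the problem parallel znodes would otherwise need multi-write to avoid.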

On 12/20/12 4:03 PM, Jay Kreps wrote:
> Okay I did some assessment of use cases we have which aren't using the
> default offset storage API and came up with one generalization. I would
> like to propose adding a generic metadata field to the offset API on a
> per-partition basis. That would leave us with the following:
>
> OffsetCommitRequest => ConsumerGroup [TopicName [Partition Offset Metadata]]
>
> OffsetFetchResponse => [TopicName [Partition Offset Metadata ErrorCode]]
>
>    Metadata => string
>
> If you want to store a reference to any associated state (say an HDFS file
> name) so that, if consumption fails over, the new consumer can start up
> with the same state, this would be the place to store it. It is not
> intended to support large values (we could enforce a 1k limit or something);
> it should hold something small or a reference to where the state can be
> found (say a file name).
>
> Objections?
>
> -Jay
>
>
> On Mon, Dec 17, 2012 at 10:45 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
>> Hey Guys,
>>
>> David has made a bunch of progress on the offset commit api implementation.
>>
>> Since this is a public API it would be good to do as much thinking
>> up-front as possible to minimize future iterations.
>>
>> It would be great if folks could do the following:
>> 1. Read the wiki here:
>> https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management
>> 2. Check out the code David wrote here:
>> https://issues.apache.org/jira/browse/KAFKA-657
>>
>> In particular our hope is that this API can act as the first step in
>> scaling the way we store offsets (ZK is not really very appropriate for
>> this). This of course requires having some plan in mind for offset storage.
>> I have written (and then after getting some initial feedback, rewritten) a
>> section in the above wiki on how this might work.
>>
>> If no one says anything I will be taking a slightly modified patch that
>> adds this functionality on trunk as soon as David gets in a few minor
>> tweaks.
>>
>> -Jay
>>