Kafka >> mail # dev >> Consumer re-design proposal

Re: Consumer re-design proposal
Throwing a +1 on "Allow the consumer to reset its offset to some arbitrary value, and then write that offset into ZK".

We're currently running into a scenario where we need 100% reliability, but we lose a few messages when a connection breaks while some messages are still sitting in the OS TCP buffers. So we're planning to shift the ZK offset a few seconds "back in time" when we detect that a broker has gone down, to make sure all messages are actually delivered to the end consumer when that broker comes back up, even if that means a small number of overlapping (duplicate) messages.
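
To make the rewind idea concrete, here is a minimal Python sketch. It is not Kafka code and does not talk to ZooKeeper: `OffsetStore` is an in-memory stand-in for the offsets kept in ZK, and `RewindingConsumer`, `record`, and `rewind` are hypothetical names chosen for illustration. It also assumes the consumer keeps its own (timestamp, offset) checkpoints, which is one possible way to translate "a few seconds back" into an offset.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class OffsetStore:
    """In-memory stand-in for per-partition offsets stored in ZK (hypothetical)."""
    offsets: dict = field(default_factory=dict)

    def commit(self, partition, offset):
        self.offsets[partition] = offset


@dataclass
class RewindingConsumer:
    """Tracks (timestamp, offset) checkpoints so the offset can be rewound."""
    store: OffsetStore
    rewind_seconds: float = 5.0
    # partition -> list of (timestamp, offset), appended in increasing time order
    checkpoints: dict = field(default_factory=dict)

    def record(self, partition, timestamp, offset):
        """Remember the offset reached at `timestamp` and commit it."""
        self.checkpoints.setdefault(partition, []).append((timestamp, offset))
        self.store.commit(partition, offset)

    def rewind(self, partition, now):
        """Move the committed offset back to the last checkpoint taken at or
        before `now - rewind_seconds`. Intended to be called when a broker
        failure is detected. Returns the new offset, or None if there is no
        history for the partition."""
        history = self.checkpoints.get(partition, [])
        if not history:
            return None
        cutoff = now - self.rewind_seconds
        times = [t for t, _ in history]
        idx = bisect.bisect_right(times, cutoff) - 1
        # If no checkpoint is old enough, fall back to the earliest one.
        _, offset = history[max(idx, 0)]
        self.store.commit(partition, offset)
        return offset
```

With this approach, messages between the rewound offset and the last committed one are delivered again after the broker returns, so the end consumer has to tolerate duplicates.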


On Jun 14, 2012, at 2:39 PM, Evan Chan wrote:

> I would like to throw in a couple use cases:
>   - Allow the new consumer to reset its offset to either the current
>   largest or smallest.  This would be a great way to restart a process that
>   has fallen behind.  The only way I know how to do this today, with the
>   high-level consumer, is to delete the ZK nodes manually and restart the
>   consumer.
>   - Allow the consumer to reset its offset to some arbitrary value, and
>   then write that offset into ZK. Kind of like the first case, but would
>   make rewinding/replays much easier.
> Modularity (the ability to layer the ZK infrastructure on top of the simple
> interface) would be great.
> thanks,
> Evan
> On Tue, Jun 12, 2012 at 9:59 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>> This is a great summary, Neha. It would be good to get people's feedback on
>> this since we don't want to keep breaking API and
>> protocol compatibility here, so the hope is to really get it right this
>> time now that we have really seen all the use cases and lived with the
>> output for a while. I think the consumer design is a pretty hard protocol
>> and API design problem, so it's fun to think about.
>> If I were to summarize Neha's requirements list, I think there are three
>> high-level goals:
>>  1. Simplify the consumer protocol to enable ease of development of
>>  consumer clients in other languages
>>  2. Try to replace the "simple consumer" and "high level consumer" with a
>>  single, general interface that has all the advantages of both.
>>  3. Support a bunch of use cases that either we didn't think of, or that
>>  weren't possible in the partitioning model of the pre-0.8 code base.
>> -Jay
>> On Mon, Jun 11, 2012 at 4:52 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>> Over the past few months, we've received quite a lot of feedback on the
>>> consumer side features and design. Some of them are improvements to the
>>> current consumer design and some are simply new feature/API requests. I
>>> have attempted to write up the requirements that I've heard on this wiki -
>>> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design
>>> This would involve some significant changes to the consumer APIs, so we
>>> would like to collect feedback on the proposal from our community. Since
>>> the list of changes is not small, we would like to understand if some
>>> features are preferred over others, and more importantly, if some features
>>> are not required at all.
>>> Since some part of this proposal is experimental and the consumer side
>>> changes are non-trivial, we would like this initiative to not interfere
>>> with the forthcoming replication release. However, it will be good to have
>>> people from the community give this some thought and help out with the
>>> JIRAs if interested. One way of managing this project could be creating a
>>> separate branch from the Kafka trunk and continuing development on it. Once
>>> it is ready and in good shape, we can think about cutting another release
>>> (after 0.8) for releasing the new consumer API. Do people have
>>> preferences/concerns regarding creating a separate branch for this project?
>>> Please feel free to start a discussion on this JIRA -
>>> https