Kafka, mail # user - Re: Consumer re-design proposal


Re: Consumer re-design proposal
Joel Koshy 2012-06-18, 20:41
That's true - I think that's one of the major motivations of the consumer
re-design. Right now, the consumer implementation is very thick, which makes
it difficult to maintain correct implementations across multiple languages.
It will be much easier to implement a consumer with the thinner logic - and
as you pointed out, many languages have pretty good bindings with native C
libraries, so technically we could go pretty far with just a JVM and a native
(C) implementation of the consumer logic.

Joel

On Mon, Jun 18, 2012 at 11:40 AM, Sybrandy, Casey <[EMAIL PROTECTED]> wrote:

> Would porting the consumer/producer code to C be a good idea?  I say this
> because at least with most languages I know of, leveraging a C library is
> pretty easy.  This way, you would have to maintain only the C library and
> others can make/maintain wrappers for their languages.  Having to port to
> other languages is going to cause you to have a significant amount of
> maintenance if you change the protocol in the future.
>
> ________________________________________
> From: Neha Narkhede [[EMAIL PROTECTED]]
> Sent: Thursday, June 14, 2012 5:53 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: Consumer re-design proposal
>
> Thanks for the feedback! I moved it to
> https://issues.apache.org/jira/browse/KAFKA-364, so that we can keep track
> of these.
>
> -Neha
>
> On Thu, Jun 14, 2012 at 2:45 PM, Marcos Juarez <[EMAIL PROTECTED]> wrote:
>
> > Throwing a +1 on "Allow the consumer to reset its offset to some
> > arbitrary value, and then write that offset into ZK".
> >
> > We're currently running into a scenario where we would like to have 100%
> > reliability, and we're losing a few messages when a connection is broken,
> > but there were still a few messages in the OS TCP buffers. So, we're
> > planning on shifting the ZK offset by a few seconds "back in time" if we
> > detect a broker has gone down, to make sure all the messages will be
> > actually delivered to the end consumer when that broker comes back up,
> > even if there's a small amount of overlapping messages.
> >
> > Thanks,
> >
> > Marcos
> >
> >
> > On Jun 14, 2012, at 2:39 PM, Evan Chan wrote:
> >
> > > I would like to throw in a couple use cases:
> > >
> > >
> > >   - Allow the new consumer to reset its offset to either the current
> > >   largest or smallest.  This would be a great way to restart a process
> > >   that has fallen behind.  The only way I know how to do this today,
> > >   with the high-level consumer, is to delete the ZK nodes manually and
> > >   restart the consumer.
> > >   - Allow the consumer to reset its offset to some arbitrary value, and
> > >   then write that offset into ZK.  Kind of like the first case, but
> > >   would make rewinding/replays much easier.
> > >
> > > Modularity (the ability to layer the ZK infrastructure on top of the
> > > simple interface) would be great.
> > >
> > > thanks,
> > > Evan
> > >
> > >
> > > On Tue, Jun 12, 2012 at 9:59 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > >
> > >> This is a great summary, Neha. It would be good to get people's
> > >> feedback on this since we don't want to keep breaking API and
> > >> protocol compatibility here, so the hope is to really get it right
> > >> this time now that we have really seen all the use cases and lived
> > >> with the output for a while. I think the consumer design is a pretty
> > >> hard protocol and API design problem, so it's fun to think about.
> > >>
> > >> If I were to summarize Neha's requirements list, I think there are
> > >> three high-level goals:
> > >>
> > >>  1. Simplify the consumer protocol to enable ease of development of
> > >>  consumer clients in other languages
> > >>  2. Try to replace the "simple consumer" and "high level consumer"
> > >>  with a single, general interface that has all the advantages of both.
> > >>  3. Support a bunch of use cases that either we didn't think of, or