Kafka, mail # dev - producer rewrite


Re: producer rewrite
Joe Stein 2014-01-23, 20:00
Awesome! +1 for checking this in as-is, as you suggest.

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/
On Thu, Jan 23, 2014 at 2:37 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> This approach sounds reasonable to me. Since the new code will not be
> used in the current kafka jar, we can still release 0.8.1 off trunk when
> it's ready.
>
> Thanks,
>
> Jun
>
>
> On Thu, Jan 23, 2014 at 10:23 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > Hey all,
> >
> > I have been working on a rewrite of the producer as described in the wiki
> > below and discussed in a few previous threads:
> > https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite
> >
> > My code still has some bugs and is a bit rough in parts, but it
> > functions in the basic cases. I did some basic performance tests over
> > localhost, and the new approach has paid off quite significantly--for small
> > (10 byte) messages a single thread on my laptop can send over 1m
> > messages/second, and with larger messages easily maxes out the server.
> >
> > The difference between "sync" and "async" producers largely disappears--all
> > requests immediately return a future response which can be used to get the
> > behavior of either sync or async usage, and we batch whenever the producer
> > is under load using a "group commit"-like approach. You can encourage
> > additional batching by incurring a small amount of latency (as before).
> >
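
[Illustrative sketch, not part of the original message and not the code Jay describes, which is Java.] The idea above, where every send returns a future and batching falls out of whatever accumulates while the sender is busy, might look roughly like the following Python. The names `SketchProducer`, `send_batch` and `fake_send_batch` are hypothetical and exist only for this example; the "offsets" are simply positions in an in-memory list.

```python
import queue
import threading
from concurrent.futures import Future


class SketchProducer:
    """Hypothetical producer: send() never blocks and always returns a Future."""

    def __init__(self, send_batch):
        # send_batch is an assumed callable: list of messages -> list of offsets.
        self._send_batch = send_batch
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        future = Future()
        self._queue.put((message, future))
        return future

    def close(self):
        self._queue.put(None)  # sentinel: stop after draining earlier sends
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()  # block until there is work
            if item is None:
                return
            batch, closing = [item], False
            # "Group commit"-like step: take everything that piled up
            # while the previous batch was being sent.
            while True:
                try:
                    nxt = self._queue.get_nowait()
                except queue.Empty:
                    break
                if nxt is None:
                    closing = True
                    break
                batch.append(nxt)
            try:
                offsets = self._send_batch([msg for msg, _ in batch])
                for (_, fut), offset in zip(batch, offsets):
                    fut.set_result(offset)
            except Exception as exc:
                for _, fut in batch:
                    fut.set_exception(exc)
            if closing:
                return


log = []  # stands in for the server's log


def fake_send_batch(messages):
    start = len(log)
    log.extend(messages)
    return list(range(start, start + len(messages)))


producer = SketchProducer(fake_send_batch)

# "Sync" usage: block on the future immediately.
first_offset = producer.send(b"hello").result(timeout=5)

# "Async" usage: fire off many sends, collect the results later.
futures = [producer.send(b"m%d" % i) for i in range(100)]
later_offsets = [f.result(timeout=5) for f in futures]

producer.close()
```

In this sketch the caller, not the producer, chooses between the two styles: calling `result()` right away behaves like a sync send, while holding on to the futures behaves like an async one. Batch sizes are not fixed; they depend on how many sends arrive while the background thread is busy.
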
> > Let's talk about how to integrate this code.
> >
> > This is a from-scratch rewrite of the producer code. As such it is a pretty
> > major change. So far I have mostly been working on my own. I'd like to
> > start getting feedback before I get too far along--no point in my polishing
> > things that are going to be significantly revised in review, after all.
> >
> > As such here is what I would propose:
> >
> > 1. I'll put up a preliminary patch. Since this code is a completely
> > standalone module it will not destabilize the existing server or existing
> > producer (in fact there is no change to those). I will avoid including
> > build support in this patch until we get the gradle stuff worked out so as
> > to not break that patch (hopefully that moves along). Let's take this patch
> > "as is" but with no expectation that the code is complete or that checkin
> > implies everyone agrees with every design decision. I will follow-up with
> > subsequent patches as we do reviews and discussions.
> >
> > 2. I'll send out a few higher-level topics for discussion threads. Let's
> > get to consensus on these. I think micro-reviewing minor correctness issues
> > won't be productive until we make higher level decisions. The topics I'd
> > like to discuss include:
> > a. The producer code:
> >      - The public API
> >      - The configurations: their names, and the general knobs we are
> >      - Client message serialization
> >      - The instrumentation to have
> >      - The blocking and batching behavior
> > b. The common code and a few other cross-cutting policy things
> >      - The approach to protocol definition and request serialization
> >      - The config definition helper code
> >      - The metrics package
> >      - The project layout
> >      - The java coding style and the use of java
> >      - The approach to logging
> >
> > This is somewhat backwards, but I think it will be easier to handle changes
> > that fall out of these discussions against an existing code base that is
> > checked in; otherwise each revision will be a brand new very large patch.
> >
> > If there are no objections I will toss up this code and kick off some of these
> > discussions.
> >
> > -Jay
> >
>