Jay Kreps 2013-07-26, 19:00
Jason Rosenberg 2013-07-26, 21:46
Chris Hogue 2013-08-02, 19:29
Great comments, answers inline!
On Fri, Aug 2, 2013 at 12:28 PM, Chris Hogue <[EMAIL PROTECTED]> wrote:
> These sounds like great steps. A couple of votes and questions:
> 1. Moving serialization out and basing it all off of byte for key and
> payload makes sense. Echoing a response below, we've ended up doing that in
> some cases anyway, and the others do a trivial transform to bytes with an
> 2. On the single producer thread, we're actually suffering a bit from this
> in 0.8, but it's mostly because compression and the blocking send happen on
> this thread. In 0.7 since there was a thread-per-broker, a nice side-effect
> was that compression and the blocking could "go wide", at least to the
> number of brokers. If compression is moved out and the sends are now
> non-blocking then this sounds like a nice improvement.
I think even in 0.7 there was only one thread, right?
> 3. The wiki talks about static partition assignment for consumers. Just
> adding a vote for that as we're currently working through how to do that
> ourselves with the 0.8 consumer.
Cool, yeah currently you must use the simple consumer to get that which is
> 4. I'm curious how compression would interact with the new ByteBuffer
> buffering you've described. If I'm following correctly you've said that
> rather than queueing objects you'd end up doing in-place writes to the
> pre-allocated ByteBuffer. Presumably this means the compression has already
> happened on the user thread. But if there's no batching/buffering except in
> the ByteBuffer, is there somewhere that multiple messages will be
> compressed together (since it should result in better compression)? Maybe
> there's still batching before this and I read too much into it?
I'm not 100% sure, but I believe the compression can still be done inline.
The compression algorithm will buffer a bit, of course. What we currently
do though is write out the full data uncompressed and then compress it.
This is pretty inefficient. Basically we are using Java's OutputStream apis
for compression but we need to be using the lower-level array oriented
algorithms like (Deflater). I haven't tried this but my assumption is that
we can compress the messages as they arrive into the destination buffer
instead of the current approach.
> 5. I don't know if this is quite the right place to discuss it, but since
> the producer has some involvement I'll throw it out there. The un-compress,
> assign offsets, re-compress that happens on the broker with the built-in
> compression API is a significant bottleneck that we're really trying to
> avoid. As noted in another thread, we saw a throughput increase on the
> order of 3x when we pre-batched and compressed the payloads before sending
> it to the producer with 0.8.
Yes, it is a bummer. We think ultimately this does make sense though, for
two reasons beyond offsets:
1. You have to validate the integrity of the data the client has sent to
you or else one bad or buggy client can screw up all consumers.
2. The compression of the log should not be tied to the compression used by
individual producers. We haven't made this change yet, but it is an easy
one. The problem today is that if your producers send a variety of
compression types your consumers need to handle the union of all types and
you have no guarantee over what types producers may send in the future.
Instead we think these should be decoupled. The topic should have a
compression type property and that should be totally decoupled from the
compression type the producer uses. In many cases there is no real need for
the producer to use compression at all as the real thing you want to
optimize is later inter-datacenter transfers no the network send to the
local broker so the producer can just send uncompressed and have the broker
control the compression type.
The performance really has two causes though:
1. GZIP is super slow, especially java's implementation. But snappy, for
example, is actually quite fast. We should be able to do snappy at network
speeds according to the perf data I have seen, but...
2. ...our current compression code is kind of inefficient due to all the
copying and traversal, due to the reasons cited above.
So in other words I think we can make this a bit better but it probably
won't go away. How do you feel about snappy?
We can't really do this because we are multi-writer so any offset we give
the client would potentially be used by another producer and then be
invalid or non-sequential.
Chris Hogue 2013-08-02, 23:56
Jay Kreps 2013-08-03, 02:42
Chris Hogue 2013-08-03, 13:50
Tommy Messbauer 2013-07-29, 15:28
Jay Kreps 2013-08-02, 19:50