Here is a comment from Guozhang on this issue. He posted it on the
compression byte-copying issue, but it is really about not needing to do
compression. His suggestion is interesting though it ends up pushing more
complexity into consumers.
Guozhang Wang commented on KAFKA-527:
One alternative approach would be like this:
Currently in the compression code (ByteBufferMessageSet.create), for each
message we write 1) the incrementing logical offset in LONG, 2) the message
byte size in INT, and 3) the message payload.
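To make the layout above concrete, here is a minimal Python sketch of the per-message framing. It assumes a big-endian 8-byte offset and 4-byte size with no other header fields; the helper names are made up for illustration and are not the actual ByteBufferMessageSet code.

```python
import struct

# Assumed layout per entry: 8-byte offset (LONG), 4-byte size (INT), payload.
# Big-endian byte order is an assumption made for this sketch.
ENTRY_HEADER = struct.Struct(">qi")


def encode_entries(first_offset, payloads):
    """Write each payload prefixed by its incrementing offset and its size."""
    out = bytearray()
    for i, payload in enumerate(payloads):
        out += ENTRY_HEADER.pack(first_offset + i, len(payload))
        out += payload
    return bytes(out)


def decode_entries(buf):
    """Read back (offset, payload) pairs from the framed buffer."""
    entries = []
    pos = 0
    while pos < len(buf):
        offset, size = ENTRY_HEADER.unpack_from(buf, pos)
        pos += ENTRY_HEADER.size
        entries.append((offset, buf[pos:pos + size]))
        pos += size
    return entries
```

Each entry carries 12 bytes of header, 8 of which are the offset that the proposal below tries to avoid reading.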
The idea is that since the logical offset simply increments, within
a compressed message, as long as we know the offset of the first message,
we know the offsets of the remaining messages without even reading their
offset fields.
So we can skip reading the offset of each message inside the
compressed message and read only the offset of the wrapper message, which is
the offset of the last message + 1; assignOffsets then only needs to modify
the offset of the wrapper message. Another change would be on the consumer
side: the iterator would need to be smart about interpreting the offsets of
messages while deep-iterating the compressed message.
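The arithmetic behind this can be sketched as follows, assuming (as stated above) that the wrapper offset is the offset of the last inner message + 1. Both function names are hypothetical and only illustrate the idea.

```python
def assign_wrapper_offset(next_log_offset, message_count):
    """Broker side: only the wrapper offset is written.

    The inner messages implicitly occupy next_log_offset up to
    next_log_offset + message_count - 1, so the wrapper gets last + 1.
    """
    return next_log_offset + message_count


def infer_offsets(wrapper_offset, message_count):
    """Consumer side: rebuild inner offsets while deep-iterating."""
    first = wrapper_offset - message_count
    return [first + i for i in range(message_count)]
```

For example, if the log's next offset is 100 and the wrapper holds three messages, the wrapper would get offset 103 and the consumer would derive 100, 101, 102. This only holds while the inner offsets have no gaps.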
As Jay pointed out, this method would not work with log compaction, since
compaction would break the assumption that offsets increment contiguously.
Two workarounds for this issue:
1) In log compaction, instead of deleting the to-be-deleted message, just
set its payload to null but keep its header, thereby keeping its slot
in the incrementing offset sequence.
2) During the compression process, instead of writing the absolute value of
the logical offset of messages, write the deltas of their offset compared
with the offset of the wrapper message. So -1 would mean continuously
decrementing from the wrapper message offset, and -2/3/... would be
skipping holes in side the compressed message.
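A rough Python sketch of the two workarounds follows. It reads "delta" as the message offset minus the wrapper offset, which is one possible interpretation of the description above, and all function names are hypothetical.

```python
def compact_keep_slots(payloads, deleted_indexes):
    """Workaround 1: null out deleted payloads but keep their slots."""
    return [None if i in deleted_indexes else p
            for i, p in enumerate(payloads)]


def to_deltas(wrapper_offset, offsets):
    """Workaround 2: store each offset relative to the wrapper offset."""
    return [offset - wrapper_offset for offset in offsets]


def from_deltas(wrapper_offset, deltas):
    """Consumer side: recover absolute offsets from the stored deltas."""
    return [wrapper_offset + delta for delta in deltas]
```

With a wrapper offset of 104 and surviving offsets 100, 102 and 103 (101 was compacted away), the stored deltas under this reading would be -4, -2, -1, and the gap at -3 marks the hole. With workaround 1 the slot for 101 stays in place with a null payload, so the contiguous-offset inference still works.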
On Fri, Aug 2, 2013 at 10:19 PM, Jay Kreps <[EMAIL PROTECTED]> wrote: