|
|
-
Re: Troubles with compressed message setDavid Arthur 2013-01-30, 21:07
On 1/30/13 11:18 AM, Jay Kreps wrote: > Hi David, > > MessageSets aren't size delimited because that format is shared with > producer/consumer and the kafka log itself. The log itself is just a big > sequence of messages and any subset that begins and ends at valid message > boundaries is a valid message set. This means that message sets are size > prefixed only as part of the request/response. Not sure if that is what you > are asking? Other repeated elements in the protocol are preceded by a int32 to indicate the number of repeated elements. E.g., topicCount in https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/api/ProducerRequest.scala#L38 MessageSet does not do this, it just starts writing out the messages https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L35. To be more like the rest of the protocol, it would need to first do something like: buffer.putInt(messages.size) It's not really a big deal either way, but I thought it important to point out since it's different than the rest of the protocol and it tripped me up. -David > > It's hard to see the cause of the error you describe. I don't suppose you > could send me a snapshot of your client to reproduce locally? > > -Jay > > > > > On Wed, Jan 30, 2013 at 7:26 AM, David Arthur <[EMAIL PROTECTED]> wrote: > >> I'm working on a client and I'm running into difficulties with compressed >> message sets. I am able to produce them fine, but when I go to fetch them, >> things seem strange. >> >> I am sending a message who's value is a compressed message set. The inner >> message set contains a single message. Specifically what looks weird is >> that the key of the top message looks corrupt. Here is a trace of my >> payloads: >> >> https://gist.github.com/**bf134906f6559b0f54ad<https://gist.github.com/bf134906f6559b0f54ad> >> >> See the "???" down in the FetchResponse for what I mean. Also the magic >> byte and attributes are wrong >> >> The data in the Kafka log for this partition matches what I get back for >> the MessageSet in the FetchResponse: >> >> \x00\x00\x00\x00\x00\x00\x00\**x00\x00\x00\x00;\xf5#\xc2N\** >> x00\x01\xff\xff\xff\xff\x00\**x00\x00-\x1f\x8b\x08\x00\x00\** >> x00\x00\x00\x00\x00c`\x80\x03\**x89P\xf7\xef\xccL >> \x16sZ~>\x90bw\x8f\xf2\x0c\**x08HM\x01\x00\xc5\x93\xd3<$\**x00\x00\x00 >> >> Another bit of weirdness here is how MessageSets are encoded. Everywhere >> else in the API, we prefix a repeated element with a size of int32. When >> encoding MessageSets, if I follow this convention, Kafka rejects the >> produce request - if I exclude that int32 it works fine. I don't know if >> this was intentional or not, but it is somewhat annoying and inconsistent. >> When decoding MessageSets, I have to do-while instead of iterate a known >> number of times. >> >> Cheers >> -David >> |