David Arthur
2013-01-30, 15:27
Jay Kreps
2013-01-30, 16:19
David Arthur
2013-01-30, 21:00
David Arthur
2013-01-30, 21:07
Jay Kreps
2013-01-30, 21:17
David Arthur
2013-01-30, 21:23
-
Troubles with compressed message set
David Arthur 2013-01-30, 15:27
I'm working on a client and I'm running into difficulties with compressed message sets. I am able to produce them fine, but when I go to fetch them, things seem strange.

I am sending a message whose value is a compressed message set. The inner message set contains a single message. Specifically, what looks weird is that the key of the top message looks corrupt. Here is a trace of my payloads:

https://gist.github.com/bf134906f6559b0f54ad

See the "???" down in the FetchResponse for what I mean. Also, the magic byte and attributes are wrong.

The data in the Kafka log for this partition matches what I get back for the MessageSet in the FetchResponse:

\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00;\xf5#\xc2N\x00\x01\xff\xff\xff\xff\x00\x00\x00-\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00c`\x80\x03\x89P\xf7\xef\xccL \x16sZ~>\x90bw\x8f\xf2\x0c\x08HM\x01\x00\xc5\x93\xd3<$\x00\x00\x00

Another bit of weirdness here is how MessageSets are encoded. Everywhere else in the API, we prefix a repeated element with an int32 size. When encoding MessageSets, if I follow this convention, Kafka rejects the produce request; if I exclude that int32, it works fine. I don't know if this was intentional or not, but it is somewhat annoying and inconsistent. When decoding MessageSets, I have to do-while instead of iterating a known number of times.

Cheers
-David
-
Re: Troubles with compressed message set
Jay Kreps 2013-01-30, 16:19
Hi David,

MessageSets aren't size delimited because that format is shared with producer/consumer and the Kafka log itself. The log itself is just a big sequence of messages, and any subset that begins and ends at valid message boundaries is a valid message set. This means that message sets are size prefixed only as part of the request/response. Not sure if that is what you are asking?

It's hard to see the cause of the error you describe. I don't suppose you could send me a snapshot of your client to reproduce locally?

-Jay
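
Editorial sketch: one way a client might walk a message set that carries no element count, as described above. It assumes each entry is laid out as an int64 offset, an int32 message size, and then that many message bytes, all big-endian. The names `iter_message_set` and `entry` are hypothetical helpers for illustration and are not part of any Kafka client. Stopping quietly at a truncated trailing entry is a defensive choice made for this sketch.

```python
import struct

# Assumed per-entry header: int64 offset followed by int32 message size.
ENTRY_HEADER = struct.Struct(">qi")


def iter_message_set(buf):
    """Yield (offset, message_bytes) until the buffer is exhausted.

    There is no count prefix, so the loop is driven by the number of
    bytes remaining rather than by a known number of elements.
    """
    pos = 0
    while pos + ENTRY_HEADER.size <= len(buf):
        offset, size = ENTRY_HEADER.unpack_from(buf, pos)
        pos += ENTRY_HEADER.size
        if size < 0 or pos + size > len(buf):
            break  # truncated trailing entry: stop instead of failing
        yield offset, buf[pos:pos + size]
        pos += size


def entry(offset, payload):
    """Build one entry for demonstration purposes (hypothetical helper)."""
    return ENTRY_HEADER.pack(offset, len(payload)) + payload


if __name__ == "__main__":
    data = entry(0, b"abc") + entry(1, b"de")
    for offset, message in iter_message_set(data):
        print(offset, message)
```

Because any run of whole entries is itself a valid set, the same loop works whether the bytes came from a fetch response or straight from a log segment.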
-
Re: Troubles with compressed message set
David Arthur 2013-01-30, 21:00
Jay,

Figured it out.

In Message.scala, CurrentMagicValue is set to 0; should be 2. This was causing my client to attempt to decode it as a v0 message. Changing the value to 2 solved my problem. Seems like a trivial change, so I'll let you decide if you want a Jira or not.

From my previous example, https://gist.github.com/bf134906f6559b0f54ad#file-gistfile1-txt-L71 should be 2

-David

messages to set their offsets
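
Editorial sketch: the byte dump from the first post can be unpacked by hand to see which magic and attributes values are actually on disk. The code assumes the dump survived the mail archive intact and that the entry is laid out as offset (int64), size (int32), CRC (uint32), magic (int8), attributes (int8), length-prefixed key, and length-prefixed value, where a length of -1 means null. `parse_entry` is a hypothetical helper written for this illustration.

```python
import struct

# Bytes copied from the first post in this thread.
DUMP = (b"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00;\xf5#\xc2N\x00\x01"
        b"\xff\xff\xff\xff\x00\x00\x00-\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x00"
        b"c`\x80\x03\x89P\xf7\xef\xccL \x16sZ~>\x90bw\x8f\xf2\x0c\x08HM\x01\x00"
        b"\xc5\x93\xd3<$\x00\x00\x00")

# Assumed fixed-width prefix: offset, size, crc, magic, attributes, key length.
PREFIX = ">qiIbbi"


def parse_entry(buf):
    """Split one message-set entry into its fields (assumed layout)."""
    offset, size, crc, magic, attributes, key_len = struct.unpack_from(PREFIX, buf, 0)
    pos = struct.calcsize(PREFIX)
    key = None
    if key_len >= 0:  # -1 marks a null key
        key = buf[pos:pos + key_len]
        pos += key_len
    (value_len,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    value = buf[pos:pos + value_len] if value_len >= 0 else None
    return {"offset": offset, "size": size, "crc": crc, "magic": magic,
            "attributes": attributes, "key": key, "value": value}


if __name__ == "__main__":
    fields = parse_entry(DUMP)
    for name in ("offset", "size", "magic", "attributes", "key"):
        print(name, fields[name])
    print("value length", len(fields["value"]))
    print("value starts with gzip magic", fields["value"][:2] == b"\x1f\x8b")
```

Read this way, the entry has magic 0, attributes 1, a null key, and a 45-byte value that begins with the gzip signature `\x1f\x8b`. A decoder that expects a different layout for magic 0 would read the same bytes as a garbled key.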
-
Re: Troubles with compressed message set
David Arthur 2013-01-30, 21:07
On 1/30/13 11:18 AM, Jay Kreps wrote:
> Hi David,
>
> MessageSets aren't size delimited because that format is shared with
> producer/consumer and the kafka log itself. The log itself is just a big
> sequence of messages and any subset that begins and ends at valid message
> boundaries is a valid message set. This means that message sets are size
> prefixed only as part of the request/response. Not sure if that is what you
> are asking?

Other repeated elements in the protocol are preceded by an int32 to indicate the number of repeated elements. E.g., topicCount in https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/api/ProducerRequest.scala#L38

MessageSet does not do this; it just starts writing out the messages: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L35

To be more like the rest of the protocol, it would need to first do something like:

buffer.putInt(messages.size)

It's not really a big deal either way, but I thought it important to point out since it's different from the rest of the protocol and it tripped me up.

-David
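
Editorial sketch: the two conventions being contrasted here, side by side. Other repeated elements carry an int32 element count, while a message set is written as back-to-back entries and, inside a request or response, is preceded by its length in bytes. All three function names are hypothetical, and the offset field is simply filled with 0 on the assumption that the broker assigns real offsets on produce.

```python
import struct


def encode_array(items):
    """Count-prefixed style: int32 number of elements, then the elements."""
    return struct.pack(">i", len(items)) + b"".join(items)


def encode_message_set(messages):
    """Message-set style: entries written back to back with no count.

    Each entry is an int64 offset (0 here, an assumption), an int32
    message size, and the message bytes.
    """
    out = bytearray()
    for message in messages:
        out += struct.pack(">qi", 0, len(message)) + message
    return bytes(out)


def encode_sized_message_set(messages):
    """Request/response framing: int32 byte length, then the message set."""
    body = encode_message_set(messages)
    return struct.pack(">i", len(body)) + body


if __name__ == "__main__":
    msgs = [b"abc", b"de"]
    print(encode_array(msgs))
    print(len(encode_message_set(msgs)))
    print(struct.unpack_from(">i", encode_sized_message_set(msgs))[0])
```

The int32 in front of a message set is therefore a byte length and not an element count, which is why a decoder loops until the bytes run out.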
-
Re: Troubles with compressed message set
Jay Kreps 2013-01-30, 21:17
Ah, yes, the decision, since we were making a breaking change to the protocol, was to just reset all the versions to 0. But if you are trying to handle both cases from the same code that will be hard.

-Jay
-
Re: Troubles with compressed message set
David Arthur 2013-01-30, 21:23
On 1/30/13 4:17 PM, Jay Kreps wrote:
> Ah, yes, the decision since we were making a breaking change to the
> protocol to just reset all the versions to 0.

Excellent, I shall purge my 0.7x compatible code at once :)

> But if you are trying to
> handle both cases from the same code that will be hard.
>
> -Jay