Kafka, mail # dev - Troubles with compressed message set


Re: Troubles with compressed message set
Jay Kreps 2013-01-30, 21:17
Ah, yes. Since we were making a breaking change to the
protocol anyway, the decision was to just reset all the versions to 0. But
if you are trying to handle both cases from the same code, that will be hard.

-Jay
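
The version-dispatch problem Jay describes could be handled along these lines. This is a hedged sketch only: the field offsets and names below are illustrative assumptions, not the exact Kafka wire layout.

```python
import struct

def decode_message(buf: bytes):
    """Branch on the magic byte so one client can handle multiple wire
    formats. The layout here (crc: int32, magic: int8, then payload) is
    a simplified stand-in for the real format."""
    crc, magic = struct.unpack_from(">IB", buf, 0)
    if magic == 0:
        # old-format decode path (payload layout is illustrative)
        return {"magic": 0, "crc": crc, "payload": buf[5:]}
    # a different magic value implies a different field layout
    raise ValueError("unsupported magic byte: %d" % magic)
```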
On Wed, Jan 30, 2013 at 12:59 PM, David Arthur <[EMAIL PROTECTED]> wrote:

> Jay,
>
> Figured it out.
>
> In Message.scala, CurrentMagicValue is set to 0; should be 2. This was
> causing my client to attempt to decode it as a v0 message. Changing the
> value to 2 solved my problem. Seems like a trivial change, so I'll let you
> decide whether you want a Jira or not.
>
> From my previous example,
> https://gist.github.com/bf134906f6559b0f54ad#file-gistfile1-txt-L71 should be 2
>
> -David
>
>
> On 1/30/13 11:18 AM, Jay Kreps wrote:
>
>> Hi David,
>>
>> MessageSets aren't size delimited because that format is shared with
>> producer/consumer and the kafka log itself. The log itself is just a big
>> sequence of messages and any subset that begins and ends at valid message
>> boundaries is a valid message set. This means that message sets are size
>> prefixed only as part of the request/response. Not sure if that is what
>> you are asking?
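
The framing Jay describes (a size prefix added only at the request/response layer, never inside the set itself) could be sketched like this. The helper name is hypothetical:

```python
import struct

def frame_message_set(message_set: bytes) -> bytes:
    """When a MessageSet travels inside a request/response, it is
    preceded by an int32 byte size; the set itself carries no count or
    length of its own, so any byte range of the log that starts and ends
    at message boundaries can be sent as-is."""
    return struct.pack(">i", len(message_set)) + message_set
```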
>>
>> It's hard to see the cause of the error you describe. I don't suppose you
>> could send me a snapshot of your client to reproduce locally?
>>
>> -Jay
>>
>>
>>
>>
>> On Wed, Jan 30, 2013 at 7:26 AM, David Arthur <[EMAIL PROTECTED]> wrote:
>>
>>  I'm working on a client and I'm running into difficulties with compressed
>>> message sets. I am able to produce them fine, but when I go to fetch
>>> them,
>>> things seem strange.
>>>
>>> I am sending a message whose value is a compressed message set. The inner
>>> message set contains a single message. Specifically what looks weird is
>>> that the key of the top message looks corrupt. Here is a trace of my
>>> payloads:
>>>
>>> https://gist.github.com/bf134906f6559b0f54ad
>>>
>>>
>>> See the "???" down in the FetchResponse for what I mean. Also the magic
>>> byte and attributes are wrong.
>>>
>>> The data in the Kafka log for this partition matches what I get back for
>>> the MessageSet in the FetchResponse:
>>>
>>> \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00;\xf5#\xc2N\
>>> x00\x01\xff\xff\xff\xff\x00\x00\x00-\x1f\x8b\x08\x00\x00\
>>> x00\x00\x00\x00\x00c`\x80\x03\x89P\xf7\xef\xccL
>>> \x16sZ~>\x90bw\x8f\xf2\x0c\x08HM\x01\x00\xc5\x93\xd3<$\
>>> x00\x00\x00
>>>
>>>
>>> Another bit of weirdness here is how MessageSets are encoded. Everywhere
>>> else in the API, we prefix a repeated element with a size of int32. When
>>> encoding MessageSets, if I follow this convention, Kafka rejects the
>>> produce request - if I exclude that int32 it works fine. I don't know if
>>> this was intentional or not, but it is somewhat annoying and
>>> inconsistent.
>>> When decoding MessageSets, I have to loop do-while style until the bytes
>>> are exhausted instead of iterating a known number of times.
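
The do-while decoding David mentions might look roughly like this. A hedged sketch: the (offset: int64, size: int32, message bytes) entry shape is assumed from the discussion above, and the function name is illustrative.

```python
import struct

def decode_message_set(buf: bytes):
    """Consume (offset: int64, size: int32, message bytes) entries until
    the buffer runs out. There is no leading element count, so the loop
    is bounded by byte position rather than by a known length."""
    entries = []
    pos = 0
    while pos + 12 <= len(buf):  # need room for offset + size header
        offset, size = struct.unpack_from(">qi", buf, pos)
        pos += 12
        if pos + size > len(buf):
            break  # a fetch may end with a truncated trailing message
        entries.append((offset, buf[pos:pos + size]))
        pos += size
    return entries
```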
>>>
>>> Cheers
>>> -David
>>>
>>>
>