Avro, mail # user - Do the values in the json object have to be ordered?


Re: Do the values in the json object have to be ordered?
Jonathan Coveney 2013-04-05, 08:22
Looks like an issue in my version of Avro:
https://issues.apache.org/jira/browse/AVRO-895?attachmentSortBy=dateTime

We're using 1.5.4... I guess it's time for an upgrade. Does anyone know if
there are any backwards-compatibility issues between those versions?
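
For anyone who cannot upgrade right away, the "traverse the json map and
build it up" workaround mentioned in the quoted message below could look
roughly like this. It is only a sketch in plain Python using the standard
json module, not Avro's API. The function name reorder_record and the record
name Pair are made up for illustration, the field names first and second come
from the error message, the int field types are inferred from the readInt
frames in the stack trace, and unions and named-type references are not
handled. The idea is to rewrite each JSON object so its keys follow the
schema's field order before handing the text to the order-sensitive decoder.

```python
import json


def reorder_record(schema, datum):
    """Return a copy of datum whose object keys follow the schema's field order.

    Handles records, arrays and maps only. Unions and references to named
    types are passed through untouched (an assumption of this sketch).
    Keys in datum that are not declared in the schema are dropped.
    """
    if isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            ordered = {}
            for field in schema["fields"]:
                name = field["name"]
                if name not in datum:
                    raise KeyError("missing field: " + name)
                ordered[name] = reorder_record(field["type"], datum[name])
            return ordered
        if kind == "array":
            return [reorder_record(schema["items"], item) for item in datum]
        if kind == "map":
            return {key: reorder_record(schema["values"], value)
                    for key, value in datum.items()}
    # Primitive types (and anything this sketch does not understand).
    return datum


# Hypothetical schema mirroring the two-field record from the error message.
SCHEMA = {
    "type": "record",
    "name": "Pair",
    "fields": [
        {"name": "first", "type": "int"},
        {"name": "second", "type": "int"},
    ],
}

swapped = json.loads('{"second": 2, "first": 1}')
fixed_text = json.dumps(reorder_record(SCHEMA, swapped))
print(fixed_text)
```

The reordered text is what would then be passed to the JSON decoder. This
relies on Python dicts preserving insertion order (Python 3.7+).
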
2013/4/4 Jonathan Coveney <[EMAIL PROTECTED]>

> Yeah, I'd love to have Doug's thoughts.
>
> Short of a bug fix, to work around I guess I can provide my own decoder?
> That seems like a bit of work though. I guess I could also make a builder
> for my schemas, and then traverse the json map and build it up? I guess
> making my own decoder would be less work than that.
>
> Would appreciate any thoughts on a good workaround, or if I should just
> try to patch it (assuming it is a bug) and backport the fix (something
> which I would like to avoid, but will do if I have to).
>
>
> 2013/4/4 Philip Zeyliger <[EMAIL PROTECTED]>
>
>> It smells like a bug to me.  Doug typically has more insight here about
>> the Java implementation.  I'm mainly a user of the Specific* hierarchy and
>> not the Generic one.
>>
>> -- Philip
>>
>>
>> On Thu, Apr 4, 2013 at 10:28 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>
>>> Should I consider this a bug and fix it? I'm very surprised nobody has
>>> run into this before. Or is this considered "correct" by Avro, and it just
>>> happens that Avro violates the JSON spec? IMHO I'd go with the former, but
>>> I'd love input from the powers that be.
>>>
>>>
>>> 2013/4/4 Francis Galiegue <[EMAIL PROTECTED]>
>>>
>>>> On Thu, Apr 4, 2013 at 6:21 PM, Jonathan Coveney <[EMAIL PROTECTED]>
>>>> wrote:
>>>> > I think an example is most useful:
>>>> >
>>>> > https://gist.github.com/jcoveney/5311795
>>>> >
>>>> > I realize that the python implementation isn't as strict as the Java
>>>> > implementation, though this result is a bit surprising.
>>>> >
>>>> > Basically, is it the case that the Java generic writer expects that the Json
>>>> > object's keys will be in the same order as the fields? This is what the gist
>>>> > is trying to show. I have a simple record definition, and then two identical
>>>> > json objects that match that definition, except for the order.
>>>> >
>>>> > In python this works, which you'd expect, but in Java it does not. I get the
>>>> > following:
>>>> >
>>>> > First successful!
>>>> > Exception in thread "main" java.lang.RuntimeException:
>>>> > org.apache.avro.AvroTypeException: Expected field name first got second
>>>> >     at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:43)
>>>> > Caused by: org.apache.avro.AvroTypeException: Expected field name first got second
>>>> >     at org.apache.avro.io.JsonDecoder.doAction(JsonDecoder.java:437)
>>>> >     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>>> >     at org.apache.avro.io.JsonDecoder.advance(JsonDecoder.java:121)
>>>> >     at org.apache.avro.io.JsonDecoder.readInt(JsonDecoder.java:148)
>>>> >     at org.apache.avro.io.ValidatingDecoder.readInt(ValidatingDecoder.java:83)
>>>> >     at org.apache.avro.generic.GenericDatumReader.readInt(GenericDatumReader.java:341)
>>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:146)
>>>> >     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>> >     at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:38)
>>>> >
>>>> > Am I doing something dumb wrong? Per the JSON spec, objects are unordered so
>>>> > it seems very problematic that it is expecting it to be ordered.
>>>> >
>>>> > Thank you,
>>>> > Jon
>>>>
>>>> Indeed, this contradicts the JSON spec. Order does not matter in JSON.
>>>>
>>>> Jackson however deserializes JSON with a LinkedHashMap by default. I
>>>> suppose Avro takes advantage of this, but it still contradicts the