Avro, mail # user - Do the values in the json object have to be ordered?


Re: Do the values in the json object have to be ordered?
Jonathan Coveney 2013-04-04, 20:10
Yeah, I'd love to have Doug's thoughts.

Short of a bug fix, I guess I could work around this by providing my own
decoder, though that seems like a bit of work. I could also write a builder
for my schemas, then traverse the JSON map and build the record up, but I
suspect writing my own decoder would be less work than that.

Would appreciate any thoughts on a good workaround, or if I should just try
to patch it (assuming it is a bug) and backport the fix (something which I
would like to avoid, but will do if I have to).
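
For what it's worth, a third and cheaper workaround might be to reorder the keys of the parsed JSON object to match the schema's field order before handing it to the decoder. Below is a minimal sketch of that idea using only the JDK. The class `JsonFieldReorder` and its `reorder` method are hypothetical names, not part of Avro, and `fieldOrder` stands in for the record's field names in schema order (with Avro these could presumably be collected from `Schema.getFields()`). It handles a flat record only; nested records would need the same treatment recursively.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro: rebuilds a parsed JSON object so that
// its keys follow the schema's field order.
class JsonFieldReorder {

    // fieldOrder: the record's field names in schema order (an assumption of
    // this sketch; flat records only).
    static Map<String, Object> reorder(List<String> fieldOrder, Map<String, Object> json) {
        Map<String, Object> ordered = new LinkedHashMap<>();
        for (String name : fieldOrder) {
            if (!json.containsKey(name)) {
                throw new IllegalArgumentException("Missing field: " + name);
            }
            // LinkedHashMap keeps insertion order, so this fixes the key order.
            ordered.put(name, json.get(name));
        }
        if (ordered.size() != json.size()) {
            throw new IllegalArgumentException("Unexpected extra fields in " + json.keySet());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Same two fields as in the error message, in the "wrong" order.
        Map<String, Object> swapped = new LinkedHashMap<>();
        swapped.put("second", 2);
        swapped.put("first", 1);
        System.out.println(reorder(Arrays.asList("first", "second"), swapped));
        // prints {first=1, second=2}
    }
}
```

The reordered map would then be serialized back to JSON text and decoded as usual, which costs an extra parse but avoids writing a custom decoder.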

2013/4/4 Philip Zeyliger <[EMAIL PROTECTED]>

> It smells like a bug to me.  Doug typically has more insight here about
> the Java implementation.  I'm mainly a user of the Specific* hierarchy and
> not the Generic one.
>
> -- Philip
>
>
> On Thu, Apr 4, 2013 at 10:28 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
>> Should I consider this a bug and fix it? I'm very surprised nobody has
>> run into this before. Or is this considered "correct" by Avro, and it just
>> happens that Avro violates the JSON spec? IMHO I'd go with the former, but
>> I'd love input from the powers that be.
>>
>>
>> 2013/4/4 Francis Galiegue <[EMAIL PROTECTED]>
>>
>>> On Thu, Apr 4, 2013 at 6:21 PM, Jonathan Coveney <[EMAIL PROTECTED]>
>>> wrote:
>>> > I think an example is most useful:
>>> >
>>> > https://gist.github.com/jcoveney/5311795
>>> >
>>> > I realize that the Python implementation isn't as strict as the Java
>>> > implementation, though this result is a bit surprising.
>>> >
>>> > Basically, is it the case that the Java generic reader expects that the
>>> > JSON object's keys will be in the same order as the fields? This is what
>>> > the gist is trying to show. I have a simple record definition, and then
>>> > two identical JSON objects that match that definition, except for the
>>> > order.
>>> >
>>> > In Python this works, which you'd expect, but in Java it does not. I get
>>> > the following:
>>> >
>>> > First successful!
>>> > Exception in thread "main" java.lang.RuntimeException: org.apache.avro.AvroTypeException: Expected field name first got second
>>> >     at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:43)
>>> > Caused by: org.apache.avro.AvroTypeException: Expected field name first got second
>>> >     at org.apache.avro.io.JsonDecoder.doAction(JsonDecoder.java:437)
>>> >     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>> >     at org.apache.avro.io.JsonDecoder.advance(JsonDecoder.java:121)
>>> >     at org.apache.avro.io.JsonDecoder.readInt(JsonDecoder.java:148)
>>> >     at org.apache.avro.io.ValidatingDecoder.readInt(ValidatingDecoder.java:83)
>>> >     at org.apache.avro.generic.GenericDatumReader.readInt(GenericDatumReader.java:341)
>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:146)
>>> >     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>> >     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>> >     at com.spotify.hadoop.mapred.Hrm.main(Hrm.java:38)
>>> >
>>> > Am I doing something dumb or wrong? Per the JSON spec, objects are
>>> > unordered, so it seems very problematic that it expects them to be
>>> > ordered.
>>> >
>>> > Thank you,
>>> > Jon
>>>
>>> Indeed, this contradicts the JSON spec. Order does not matter in JSON.
>>>
>>> Jackson, however, deserializes JSON into a LinkedHashMap by default. I
>>> suppose Avro takes advantage of this, but it still contradicts the
>>> spec.
>>>
>>> --
>>> Francis Galiegue, [EMAIL PROTECTED]
>>> JSON Schema in Java: http://json-schema-validator.herokuapp.com
>>>
>>
>>
>
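
To make Francis's LinkedHashMap point concrete, here is a small JDK-only sketch (no Jackson or Avro involved; the class `OrderLeak` and its helper are made-up names for illustration). Two maps holding the same entries compare as equal, which matches JSON's view that objects are unordered, yet a LinkedHashMap still iterates in insertion order, so any consumer that walks the keys sequentially can end up depending on the order of the input text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: shows that LinkedHashMap equality ignores key order
// while iteration preserves it.
class OrderLeak {

    // Builds a two-entry map, inserting k1 before k2.
    static Map<String, Integer> objectOf(String k1, int v1, String k2, int v2) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    // Returns the keys in the order the map iterates over them.
    static List<String> keysInOrder(Map<String, Integer> m) {
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> a = objectOf("first", 1, "second", 2);
        Map<String, Integer> b = objectOf("second", 2, "first", 1);

        // Map equality compares entries, not their order.
        System.out.println(a.equals(b));        // prints true
        // Iteration order follows insertion order, so it differs.
        System.out.println(keysInOrder(a));     // prints [first, second]
        System.out.println(keysInOrder(b));     // prints [second, first]
    }
}
```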