David Arthur 2013-05-08, 18:49
Re: Jackson and Avro, nested schema
It seems to me that you defined "fields" as an Array (an IndexedRecord) but
you provided input as a single Record. It might help if you change your
JSON document so that "fields" is an array with one element in it (notice
the additional square brackets [ ] for array notation):

"fields" : [  { "foo": "bar", "spam": "eggs",
                  "answer": 42,
                  "x": {"a": 1}
                 }
              ]

Have you tried this input, and if so, does it work?
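
Whichever input shape you try, it may help to check what the Avro writer is actually being handed. A minimal sketch, using only java.util, that walks a mapped object and records the runtime class of each value. The class name ValueTypes and the hand-built map are assumptions for illustration; in your code the map would come from whatever Jackson produced for "fields".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ValueTypes {
    /** Appends one "path -> runtime class" line per value, recursing into maps. */
    public static List<String> describe(String path, Object value, List<String> out) {
        String type = (value == null) ? "null" : value.getClass().getSimpleName();
        out.add(path + " -> " + type);
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                describe(path + "." + e.getKey(), e.getValue(), out);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the untyped values bound to "fields" (assumption).
        Map<String, Object> x = new LinkedHashMap<>();
        x.put("a", 1);
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("foo", "bar");
        fields.put("answer", 42);
        fields.put("x", x);

        for (String line : describe("fields", fields, new ArrayList<>())) {
            System.out.println(line);
        }
    }
}
```

If a nested value shows up as a bare Integer where the schema expects a record, that would be consistent with the ClassCastException in the trace below.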

Pankaj

On Wed, May 8, 2013 at 2:49 PM, David Arthur <[EMAIL PROTECTED]> wrote:

> I'm attempting to use Jackson and Avro together to map JSON documents to a
> generated Avro class. I have looked at the Json schema included with Avro,
> but this requires a top-level "value" element which I don't want.
> Essentially, I have JSON documents that have a few typed top level fields,
> and one field called "fields" which is more or less arbitrary JSON.
>
> I've reduced this down to strings and ints for simplicity.
>
> My first attempt was:
>
>  {
>     "type": "record",
>     "name": "Json",
>     "fields": [
>       {
>         "name": "value",
>         "type": [ "string", "int", {"type": "map", "values": "Json"} ]
>       }
>     ]
>   },
>
>   {
>     "name": "Document",
>     "type": "record",
>     "fields": [
>       {
>         "name": "id",
>         "type": "string"
>       },
>       {
>         "name": "fields",
>         "type": {"type": "map", "values": ["string", "int", {"type": "map", "values": "Json"}]}
>       }
>     ]
>   }
>
> Given a JSON document like:
>
> {
>   "id": "doc1",
>   "fields": {
>     "foo": "bar",
>     "spam": "eggs",
>     "answer": 42,
>     "x": {"a": 1}
>   }
> }
>
> this seems to work, but it doesn't. When I turn around and try to
> serialize this object with Avro, I get the following exception:
>
> java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.avro.generic.IndexedRecord
>     at org.apache.avro.generic.GenericData.getField(GenericData.java:526)
>     at org.apache.avro.generic.GenericData.getField(GenericData.java:541)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>     at org.apache.avro.generic.GenericDatumWriter.writeMap(GenericDatumWriter.java:173)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:69)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>     at org.apache.avro.generic.GenericDatumWriter.writeMap(GenericDatumWriter.java:173)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:69)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> My best guess is that since the values of the "fields" map are a union,
> they are represented in the generated class as Object, into which Jackson
> happily puts whatever it parses.
>
> If I change my schema to explicitly use "int" instead of the "Json" type,
> it works fine for my test document:
>
>         "type": {"type": "map", "values": ["string", "int", {"type": "map", "values": "int"}]}
>
> However, now I need to enumerate the types for each level of nesting I
> want. This is not ideal, and it limits me to a fixed level of nesting.
>
> To be clear, my issue is not modelling my schema in Avro, but rather
> getting Jackson to map JSON onto the generated classes without too much
> pain. I have also tried
> https://github.com/FasterXML/jackson-dataformat-avro without much luck.
>
> Any help is appreciated.
>
> -David
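
Reading the quoted schema and stack trace together, it appears the writer reaches the nested map {"a": 1}, expects each of its values to be a "Json" record, and finds a bare Integer instead. One possible workaround, sketched here under assumptions, is to post-process the untyped map and wrap nested values recursively so that arbitrary depth is handled without enumerating levels in the schema. JsonValue below is a hypothetical stand-in for the generated "Json" class (one field named "value"); it is not the Avro-generated code, and the sketch uses only java.util.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JsonWrapSketch {
    /** Hypothetical stand-in for the generated "Json" record (single field "value"). */
    public static final class JsonValue {
        public final Object value;
        public JsonValue(Object value) { this.value = value; }
    }

    /** Nested map: every entry value becomes a JsonValue, recursing to any depth. */
    public static Map<String, JsonValue> wrapMap(Map<?, ?> raw) {
        Map<String, JsonValue> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : raw.entrySet()) {
            out.put(String.valueOf(e.getKey()), new JsonValue(wrapValue(e.getValue())));
        }
        return out;
    }

    /** A map is converted to a map of JsonValue; scalars are returned unchanged. */
    public static Object wrapValue(Object v) {
        return (v instanceof Map) ? wrapMap((Map<?, ?>) v) : v;
    }

    /** Top-level "fields" map: its values are the bare union, so only nested maps change. */
    public static Map<String, Object> wrapFields(Map<String, ?> fields) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            out.put(e.getKey(), wrapValue(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the untyped "fields" map (assumption).
        Map<String, Object> x = new LinkedHashMap<>();
        x.put("a", 1);
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("answer", 42);
        fields.put("x", x);

        Map<String, Object> wrapped = wrapFields(fields);
        Object a = ((Map<?, ?>) wrapped.get("x")).get("a");
        System.out.println(a instanceof JsonValue); // nested value is now wrapped
    }
}
```

With the real generated classes, the same recursion would construct the generated "Json" record in place of JsonValue; whether that fits depends on how the generated class exposes its "value" field, which this sketch does not assume.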
--
Pankaj Shroff
[EMAIL PROTECTED]
Scott Carey 2013-05-13, 21:51
Doug Cutting 2013-05-13, 23:25