Avro >> mail # user >> Schema resolution failure when the writer's schema is a primitive type and the reader's schema is a union


Re: Schema resolution failure when the writer's schema is a primitive type and the reader's schema is a union
It makes sense when you think about it. I guess naming that parameter writerSchema rather than the generic schema would have been even clearer, but at least I know now.

Thanks Doug!

--
Alex
On Friday, August 31, 2012 at 2:22 PM, Doug Cutting wrote:

> I responded to the Jira, but will respond here too for completeness.
>
> I believe the problem is that the decoder is incorrectly constructed
> with the reader's schema rather than the writer's schema. It should
> instead be constructed in this example with:
>
> JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(writerSchema, output.toString());
>
> With that change this test passes for me.
>
> Doug
>
> On Fri, Aug 31, 2012 at 9:23 AM, Scott Carey <[EMAIL PROTECTED]> wrote:
> > Yes, please file a bug in JIRA. It will get more attention there.
> >
> > On 8/30/12 11:06 PM, "Alexandre Normand" <[EMAIL PROTECTED]>
> > wrote:
> >
> > That's one of the things I've tried already. I've reversed the order to
> > ["int", "null"] but I get the same result.
> >
> > Should I file a bug in Jira?
> >
> > --
> > Alex
> >
> > On Thursday, August 30, 2012 at 11:01 PM, Scott Carey wrote:
> >
> > My understanding of the spec is that promotion to a union should work as
> > long as the prior type is a member of the union.
> >
> > What happens if the order of the union in the reader's schema is reversed?
> >
> > This may be a bug.
> >
> > -Scott
> >
> > On 8/16/12 5:59 PM, "Alexandre Normand" <[EMAIL PROTECTED]>
> > wrote:
> >
> >
> > Hey,
> > I've been running into this case where I have a field of type int but I
> > need to allow for null values. To do this, I now have a new schema that
> > defines that field as a union of
> > null and int such as:
> > type: [ "null", "int" ]
> > According to my interpretation of the spec, avro should resolve this
> > correctly. For reference, this reads like this (from
> > http://avro.apache.org/docs/current/spec.html#Schema+Resolution):
> >
> > if
> > reader's is a union, but writer's is not
> > The first schema in the reader's union that matches the writer's schema
> > is recursively resolved against it. If none match, an error is signaled.
> >
> >
> > However, when trying to do this, I get this:
> > org.apache.avro.AvroTypeException: Attempt to process a int when a union
> > was expected.
> >
> > I've written a simple test that illustrates what I'm saying:
> > @Test
> > public void testReadingUnionFromValueWrittenAsPrimitive() throws Exception {
> >   Schema writerSchema = new Schema.Parser().parse("{\n" +
> >       " \"type\":\"record\",\n" +
> >       " \"name\":\"NeighborComparisons\",\n" +
> >       " \"fields\": [\n" +
> >       "   {\"name\": \"test\",\n" +
> >       "    \"type\": \"int\" }]} ");
> >   Schema readersSchema = new Schema.Parser().parse(" {\n" +
> >       " \"type\":\"record\",\n" +
> >       " \"name\":\"NeighborComparisons\",\n" +
> >       " \"fields\": [ {\n" +
> >       "   \"name\": \"test\",\n" +
> >       "   \"type\": [\"null\", \"int\"],\n" +
> >       "   \"default\": null } ] }");
> >
> >   GenericData.Record record = new GenericData.Record(writerSchema);
> >   record.put("test", Integer.valueOf(10));
> >
> >   ByteArrayOutputStream output = new ByteArrayOutputStream();
> >   JsonEncoder jsonEncoder =
> >       EncoderFactory.get().jsonEncoder(writerSchema, output);
> >   GenericDatumWriter<GenericData.Record> writer =
> >       new GenericDatumWriter<GenericData.Record>(writerSchema);
> >   writer.write(record, jsonEncoder);
> >   jsonEncoder.flush();
> >   output.flush();
> >
> >   System.out.println(output.toString());
> >
> >   JsonDecoder jsonDecoder =
> >       DecoderFactory.get().jsonDecoder(readersSchema, output.toString());
> >   GenericDatumReader<GenericData.Record> reader =
> >       new GenericDatumReader<GenericData.Record>(writerSchema, readersSchema);
> >   GenericData.Record read = reader.read(null, jsonDecoder);
> >   assertEquals(10, read.get("test"));
> > }
> >
> > Am I misunderstanding how avro should handle such a case of schema
> > resolution?
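
[Editor's note] For readers landing on this thread later: below is a self-contained sketch of the quoted test with the fix Doug describes applied, i.e. the JSON decoder is built from the writer's schema (the schema the bytes actually conform to), while schema resolution between writer and reader happens in the GenericDatumReader. Names follow the quoted code; treat this as illustrative rather than tied to a specific Avro release.

```java
import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonDecoder;
import org.apache.avro.io.JsonEncoder;

public class UnionResolutionSketch {
    public static void main(String[] args) throws Exception {
        // Writer schema: the field is a plain "int".
        Schema writerSchema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"NeighborComparisons\","
          + " \"fields\":[{\"name\":\"test\",\"type\":\"int\"}]}");
        // Reader schema: the same field widened to a ["null","int"] union.
        Schema readersSchema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"NeighborComparisons\","
          + " \"fields\":[{\"name\":\"test\",\"type\":[\"null\",\"int\"],"
          + "\"default\":null}]}");

        GenericData.Record record = new GenericData.Record(writerSchema);
        record.put("test", Integer.valueOf(10));

        // Encode with the writer schema.
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        JsonEncoder jsonEncoder =
            EncoderFactory.get().jsonEncoder(writerSchema, output);
        new GenericDatumWriter<GenericData.Record>(writerSchema)
            .write(record, jsonEncoder);
        jsonEncoder.flush();

        // The fix: construct the decoder with the WRITER's schema, not the
        // reader's. Resolution from writer to reader is the datum reader's
        // job, which is why it takes both schemas.
        JsonDecoder jsonDecoder =
            DecoderFactory.get().jsonDecoder(writerSchema, output.toString());
        GenericDatumReader<GenericData.Record> reader =
            new GenericDatumReader<GenericData.Record>(writerSchema, readersSchema);
        GenericData.Record read = reader.read(null, jsonDecoder);

        System.out.println(read.get("test")); // prints 10
    }
}
```

With the decoder pointed at the reader's schema instead, the decoder expects a union tag in the JSON and fails with the `AvroTypeException` quoted above, which is exactly the reported symptom.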