

Re: Issue with a union with bytes and DataFileReader?
This has been previously reported as:

https://issues.apache.org/jira/browse/AVRO-1275

Please also note that GenericData#toString() does not always produce
output that JsonDecoder can read.  If you're using JsonDecoder then
you should also use JsonEncoder.  That said, some folks don't like the
way that those classes encode unions and prefer the JSON that
GenericData#toString() generates.

A union between, e.g., a string and an enum can produce ambiguous JSON.
 To resolve this, JsonEncoder/Decoder tags non-null union values with
the intended type (a null value is written as a bare null).  A union
between string and an enum
named Flavor with values SWEET and SOUR might be rendered by
JsonEncoder as {"string":"SOUR"} or {"Flavor":"SOUR"}, while
GenericData#toString() would print "SOUR" in both cases.
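
To make the ambiguity concrete, here is a toy model in plain Java. It
does not call Avro at all: the class UnionTaggingSketch and its methods
untagged, tagged, and candidateBranches are hypothetical names made up
for this illustration, and the output strings only imitate the two
styles described above. It assumes values contain no characters that
need JSON escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of this is Avro API. It imitates the two JSON
// styles for a union of string and an enum named Flavor.
public class UnionTaggingSketch {
    // Hypothetical stand-in for the symbols of the Flavor enum schema.
    static final List<String> FLAVOR_SYMBOLS = Arrays.asList("SWEET", "SOUR");

    // Untagged style (in the spirit of GenericData#toString()):
    // only the quoted value, with no hint of the union branch.
    public static String untagged(String value) {
        return "\"" + value + "\"";
    }

    // Tagged style (in the spirit of JsonEncoder): wrap the value in a
    // one-field object whose key names the intended branch.
    public static String tagged(String branch, String value) {
        return "{\"" + branch + "\":" + untagged(value) + "}";
    }

    // Branches a reader could pick for an untagged value: any text fits
    // "string", and it also fits "Flavor" if it is one of the symbols.
    public static List<String> candidateBranches(String value) {
        List<String> branches = new ArrayList<>();
        branches.add("string");
        if (FLAVOR_SYMBOLS.contains(value)) {
            branches.add("Flavor");
        }
        return branches;
    }

    public static void main(String[] args) {
        // Untagged: one rendering, more than one possible branch.
        System.out.println(untagged("SOUR") + " fits " + candidateBranches("SOUR"));
        // Tagged: the branch is explicit, so a reader has nothing to guess.
        System.out.println(tagged("string", "SOUR"));
        System.out.println(tagged("Flavor", "SOUR"));
    }
}
```

The point of the sketch is only that the untagged form maps two
different union branches to the same text, so a decoder reading it
back cannot tell which one was written.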

The wrapping of all "bytes" values in {"bytes": ...} by
GenericData#toString() is separate and should probably be considered a
bug.  Unfortunately, fixing it would be an incompatible change, so it
should probably wait until release 1.8.

Doug

On Thu, Apr 25, 2013 at 6:26 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> This should replicate the issue on 1.7.4:
> https://gist.github.com/jcoveney/5459644
>
> Basically, when using DataFileReader to read a union of bytes, it's
> outputting in the form of {"bytes": "<thebytes>"}, which it doesn't do for
> any other union types.
>
> Is this expected? Is this a bug?
>
> I appreciate your help,
> Jon