Avro >> mail # user >> Handling field names when serializing and deserializing JSON


Re: Handling field names when serializing and deserializing JSON
It might work; I'd have to test it to be sure.  But it's not guaranteed.

Avro names are specified at:

  http://avro.apache.org/docs/current/spec.html#Names

Avro Java accepts more than this, including arbitrary unicode
alphabetic characters.

See https://issues.apache.org/jira/browse/AVRO-1022 for an extensive discussion.

Doug
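
For reference, the name rule in the spec section linked above is: start with [A-Za-z_], then only [A-Za-z0-9_]. A minimal Python sketch of a check against that rule follows. The helper names is_spec_name and to_safe_name are hypothetical, not part of any Avro library, and replacing illegal characters with an underscore is only one possible convention.

```python
import re

# Name rule from the Avro specification: a name starts with [A-Za-z_]
# and continues with only [A-Za-z0-9_].
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_spec_name(name):
    """Return True if `name` is a legal simple name under the spec rule."""
    return AVRO_NAME.fullmatch(name) is not None


def to_safe_name(name):
    """Hypothetical sanitizer: map every illegal character to '_'.

    A leading digit (or an empty name) gets a '_' prefix so the result
    always satisfies is_spec_name().
    """
    safe = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not safe or safe[0].isdigit():
        safe = "_" + safe
    return safe


for candidate in ["key", "X-Something", "dotted.key"]:
    print(candidate, is_spec_name(candidate), to_safe_name(candidate))
```

Note that this mapping can collide (for example, "a-b" and "a.b" both become "a_b"), so a real conversion would need to detect duplicates.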

On Tue, Jan 14, 2014 at 2:45 PM, Pritchard, Charles X. -ND
<[EMAIL PROTECTED]> wrote:
>
> On Jan 14, 2014, at 2:32 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
>> On Tue, Jan 14, 2014 at 2:06 PM, Pritchard, Charles X. -ND
>> <[EMAIL PROTECTED]> wrote:
>>> Do I just pop the “original” field name in as an alias and use the “safe”
>>> (alphanumeric+underscore) one as the primary name?
>>
>> If you have Avro data with names that are illegal in Hive then you could
>> provide Hive with a safely-named schema to use when reading these that
>> has the original name as an alias.  Conversely, if Hive writes data
>> with the "safe" name that you want to read as data using the original
>> name, then you'd read with the safe name in an alias.  Does that make
>> sense?
>
> Yes; so we just flop the alias/original name between Hive and other sources.
> Really appreciate the clarification there.
>
> One of the common places this comes up is with hyphens such as: “X-Something” in some JSON schemas.
>
> Thanks for letting me know how to handle this.
>
> On that same topic though — if/when someone does something really awful, like using a dot in the key name,
> is that still going to work out fine with record.get() syntax?
>
> e.g.: { "key": "val", "dotted.key": "val" }
>
> I know that in the context of avro aliases, the dot has special semantics.
>
> (I hope I’m not being too obtuse).
>
> -Charles
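
The alias swap discussed in this thread could be sketched as follows, working on the schema as plain JSON data in Python. This is an illustration under assumptions, not a tested recipe: the function with_safe_field_names and the record name "Example" are hypothetical, only top-level fields are handled, and, as noted in the reply above, it is not guaranteed that a given Avro implementation will accept an alias containing characters outside the spec's name rule.

```python
import copy
import json
import re

# Characters outside the Avro spec's name alphabet.
_ILLEGAL = re.compile(r"[^A-Za-z0-9_]")


def with_safe_field_names(schema):
    """Return a copy of a record schema (a dict) with spec-safe field names.

    Each renamed field keeps its original name as an alias, so a reader
    using this schema could, in principle, still resolve data written
    under the original name. Only top-level fields are processed.
    """
    result = copy.deepcopy(schema)
    for field in result["fields"]:
        original = field["name"]
        safe = _ILLEGAL.sub("_", original)
        if not safe or safe[0].isdigit():
            safe = "_" + safe
        if safe != original:
            field["name"] = safe
            aliases = field.setdefault("aliases", [])
            if original not in aliases:
                aliases.append(original)
    return result


# Hypothetical writer schema with a hyphenated field name.
writer_schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "key", "type": "string"},
        {"name": "X-Something", "type": "string"},
    ],
}

reader_schema = with_safe_field_names(writer_schema)
print(json.dumps(reader_schema, indent=2))
```

For the opposite direction (data written with the safe name, read under the original name), the same idea applies with the two names swapped: the original becomes the field name and the safe name goes into the aliases list.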