Avro >> mail # user >> Handling field names when serializing and deserializing JSON


Re: Handling field names when serializing and deserializing JSON

On Jan 14, 2014, at 2:32 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:

> On Tue, Jan 14, 2014 at 2:06 PM, Pritchard, Charles X. -ND
> <[EMAIL PROTECTED]> wrote:
>> Do I just pop the “original” field name in as an alias and use the “safe”
>> (alphanumeric+underscore) one as the primary name?
>
> If you have Avro data with names that are illegal in Hive then you could
> provide Hive with a safely-named schema to use when reading these that
> has the original name as an alias.  Conversely, if Hive writes data
> with the "safe" name that you want to read as data using the original
> name, then you'd read with the safe name in an alias.  Does that make
> sense?

Yes; so we just swap the alias and the original name between Hive and other sources.
Really appreciate the clarification there.

One of the common places this comes up is with hyphens such as: “X-Something” in some JSON schemas.
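
To make the swap concrete, here is a minimal sketch in Python that builds both schema variants as plain dictionaries. The helper names (`safe_name`, `field`), the record name `Example`, and the `string` field type are hypothetical and chosen only for illustration. The name pattern is the one the Avro specification gives for names; whether a particular Avro implementation or Hive accepts a non-conforming string such as "X-Something" as a field name or alias is an assumption here and should be checked against the version in use.

```python
import json
import re

# Name rule from the Avro specification: first character [A-Za-z_],
# remaining characters [A-Za-z0-9_].
AVRO_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def safe_name(original):
    """Hypothetical helper: map an arbitrary JSON key to a spec-legal name."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", original)
    # Prefix an underscore if the result is empty or starts with a digit.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


def field(name, alias, avro_type="string"):
    """Build one field entry with a primary name and a single alias."""
    return {"name": name, "type": avro_type, "aliases": [alias]}


original = "X-Something"
safe = safe_name(original)

# Schema handed to Hive: safe primary name, original name as the alias.
hive_reader = {"type": "record", "name": "Example",
               "fields": [field(safe, original)]}

# Schema for reading Hive-written data elsewhere: original primary name,
# safe name as the alias. (Assumes the reader tolerates the original name.)
other_reader = {"type": "record", "name": "Example",
                "fields": [field(original, safe)]}

hive_schema_json = json.dumps(hive_reader, indent=2)
```

Note that this sanitizer is lossy: "X-Something" and "X_Something" both map to the same safe name, so a real mapping would need a collision check.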

Thanks for letting me know how to handle this.

On that same topic though — if/when someone does something really awful, like using a dot in the key name,
is that still going to work out fine with record.get() syntax?

e.g.: { "key": "val", "dotted.key": "val" }

I know that in the context of Avro aliases, the dot has special semantics.
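
For reference, the special semantics in question can be sketched as follows. In the Avro specification, a dotted name on a named type (record, enum, fixed) is a fullname whose last component is the name and whose prefix is the namespace, and an alias without a dot inherits the namespace of the type it belongs to. The two helper functions below are hypothetical and only mirror that rule; they say nothing about how field names or `record.get()` behave in any given implementation, which is the open question here.

```python
def split_fullname(fullname):
    """Split a dotted fullname into (namespace, name) at the last dot."""
    namespace, _, name = fullname.rpartition(".")
    return namespace, name


def qualify_alias(alias, type_namespace):
    """Resolve a named-type alias to a fullname.

    An alias that already contains a dot is taken as a fullname; otherwise
    it is placed in the namespace of the type that declares it.
    """
    if "." in alias or not type_namespace:
        return alias
    return type_namespace + "." + alias


# The specification's own example: a type named "a.b" with aliases
# "c" and "x.y" has alias fullnames "a.c" and "x.y".
namespace, name = split_fullname("a.b")
resolved = [qualify_alias(a, namespace) for a in ("c", "x.y")]

# Read as a fullname, "dotted.key" would split into a namespace and a name.
dotted = split_fullname("dotted.key")
```

Under this reading, a key such as "dotted.key" would not be treated as one opaque name wherever fullname parsing applies, which is why mapping it to a safe name first is the cautious option.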

(I hope I’m not being too obtuse).

-Charles