Avro >> mail # user >> Primitive type aliases

Jay Hacker 2013-04-12, 19:15
Doug Cutting 2013-04-12, 21:09
Re: Primitive type aliases
This annotation behavior would be very useful for representing things like
"age" (a non-negative number), URI (constrained subset of "string") etc.

Doug, when you say "Python doesn't support aliases", what do you mean? What
behavior should it support? I understood aliases to be only used in schema
evolution, and the Python avro libraries seem to correctly respect aliases
when reading from another schema... or don't they?
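For reference, here is a minimal sketch of how I understand field aliases to work during schema resolution: the reader's schema lists aliases, and a reader field is matched to a writer field by its own name or by one of those aliases. This is plain Python, not the avro library; `match_fields` and the two example schemas are hypothetical names made up for illustration.

```python
def match_fields(writer_record, reader_record):
    """Map each reader field name to the writer field it would read from.

    A reader field matches a writer field by its own name first, then by
    any of its aliases. None means no writer field matched, so a real
    reader would have to fall back to the field's default.
    """
    writer_names = {f["name"] for f in writer_record["fields"]}
    mapping = {}
    for field in reader_record["fields"]:
        candidates = [field["name"]] + field.get("aliases", [])
        mapping[field["name"]] = next(
            (name for name in candidates if name in writer_names), None
        )
    return mapping


# Hypothetical schemas: the writer used "start", the reader renamed it.
writer = {
    "type": "record",
    "name": "Event",
    "fields": [{"name": "start", "type": "string"}],
}
reader = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "start_date", "type": "string", "aliases": ["start"]},
        {"name": "end_date", "type": "string", "default": ""},
    ],
}

result = match_fields(writer, reader)
print(result)
```

Here `start_date` resolves to the writer's `start` through the alias, while `end_date` has no counterpart in the writer's schema and would need its default.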

On Fri, Apr 12, 2013 at 2:09 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:

> Aliases are used for type names (records, enums, & fixed) and field
> names.  Also, I don't think aliases are implemented in Python.
> You could define a Date record with a single string and use it.  Records
> have no storage overhead, so this will result in the same serialized form
> as a string field.  If you don't want the nested structure in memory, then
> perhaps we should consider an "inline" schema annotation.  This might look
> like:
> {"type":"record", "name":"Date", "inline":true, "fields":[{"name":"value", "type":"string"}]}
> {"type":"record", "name":"Test", "fields":[{"name":"date", "type":"Date"}]}
> Then the Python implementation might be altered so that when it reads an
> inline record with a single field then it returns the value of that single
> field, and similarly accepts a value of the field on write.  This would be
> a representation-hint to the runtime, and would not affect the schema
> language or serialization so should be completely compatible.
> Thoughts?
> Doug
> On Fri, Apr 12, 2013 at 12:15 PM, Jay Hacker <[EMAIL PROTECTED]> wrote:
>> I'd like to be able to alias primitive types, for example to indicate
>> that a field of type "date" is really a string that I should treat
>> specially.  The spec says "Named types and fields may have aliases," which
>> suggests it ought to work ("string" is a named type...).
>> I don't really know how to express an alias for a primitive, but things
>> like this:
>> {
>>     "type": "record",
>>     "name": "alias-test",
>>     "fields": [
>>         {"name": "start", "type": {"type": "string", "aliases": ["date"]}},
>>         {"name": "end",   "type": "date"}
>>     ]
>> }
>> don't work (at least not in the Python 1.7.4 implementation: 'Type
>> property "date" not a valid Avro schema').  How can I alias a primitive
>> type, and if not, why not?
>> Thanks.
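
The "inline" annotation proposed in the quoted reply can be sketched outside the library as a pair of helpers applied after reading and before writing. This is only an illustration under stated assumptions: "inline" is the proposed annotation, not something the Avro spec defines, and `resolve`, `is_inline`, `unwrap`, and `wrap` are hypothetical helpers operating on already-decoded Python dicts, not avro library calls.

```python
import json

# Schemas from the quoted reply; "inline" is the proposed annotation.
DATE_SCHEMA = json.loads(
    '{"type":"record", "name":"Date", "inline":true, '
    '"fields":[{"name":"value", "type":"string"}]}'
)
TEST_SCHEMA = json.loads(
    '{"type":"record", "name":"Test", '
    '"fields":[{"name":"date", "type":"Date"}]}'
)
NAMED = {"Date": DATE_SCHEMA, "Test": TEST_SCHEMA}


def resolve(schema):
    """Look up a named type; return anything else unchanged."""
    if isinstance(schema, str) and schema in NAMED:
        return NAMED[schema]
    return schema


def is_record(schema):
    return isinstance(schema, dict) and schema.get("type") == "record"


def is_inline(schema):
    """True for a record marked inline that has exactly one field."""
    return (
        is_record(schema)
        and schema.get("inline") is True
        and len(schema["fields"]) == 1
    )


def unwrap(schema, datum):
    """Read side: replace inline single-field records with the field value."""
    schema = resolve(schema)
    if not is_record(schema):
        return datum
    out = {f["name"]: unwrap(f["type"], datum[f["name"]]) for f in schema["fields"]}
    if is_inline(schema):
        return out[schema["fields"][0]["name"]]
    return out


def wrap(schema, value):
    """Write side: rebuild the nested record from the bare value."""
    schema = resolve(schema)
    if not is_record(schema):
        return value
    if is_inline(schema):
        field = schema["fields"][0]
        return {field["name"]: wrap(field["type"], value)}
    return {f["name"]: wrap(f["type"], value[f["name"]]) for f in schema["fields"]}


nested = {"date": {"value": "2013-04-12"}}  # what a reader returns today
flat = unwrap(TEST_SCHEMA, nested)          # what an inline-aware reader might return
print(flat)
print(wrap(TEST_SCHEMA, flat) == nested)
```

Because the helpers only reshape the in-memory value, the serialized form is untouched, which is the compatibility property the proposal relies on.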
Doug Cutting 2013-04-12, 21:35
Jay Hacker 2013-04-15, 16:03
Doug Cutting 2013-04-15, 16:26