Avro >> mail # user >> How to add optional new record fields and/or new methods in avro-ipc?


Re: How to add optional new record fields and/or new methods in avro-ipc?


On 10/18/11 2:36 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

>On 10/18/2011 11:40 AM, Scott Carey wrote:
>> I'm not sure that I understand "The default value for a union is the
>> default value for its first branch."
>> Defaults don't apply to any types in a union, only to fields on a
>>record.
>> So the Schema ["Foo", "Bar"] can have no default, nor can any of its
>> branches.
>
>From the spec's definition of default values:
>
>"Permitted values depend on the field's schema type, according to the
>table below. Default values for union fields correspond to the first
>schema in the union."

In that case, Avro Java does not adhere to the spec.  I have had fields
like this in schemas I have used for a long time (more than a year, likely
nearly two):

{"name": "clientEnvironment", "type": [
  {"name": "BrowserData", "type": "record", "fields" : [
    {"name": "osName", "type": "string", "default": ""},
    {"name": "browserVersion", "type": "string", "default": ""},
    {"name": "flashVersion", "type": "string", "default": ""}
  ]},
  "null"
], "default": null}

The default, null, is definitely not of type BrowserData.  Changing Avro
Java to be strict about this would break a lot of archived data.  Nor can
it be completely flexible, since matching defaults to types is ambiguous
(e.g. the string "\u0040" could be a string, bytes, or fixed).
Perhaps changing the language to "first matching branch" or "first
compatible branch" would work, but then we have to define how the
compatibility works.
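
To make the difference between the two rules concrete, here is a minimal Python sketch using only the standard library. It is not Avro's implementation: `json_matches`, `strict_branch`, `first_matching_branch`, and `matching_branches` are hypothetical helpers, and the type check is deliberately shallow (it looks only at the JSON kind of the default, not at record fields or fixed sizes).

```python
import json

def json_matches(schema, value):
    """Hypothetical, shallow check: could this JSON value be a default for schema?"""
    kind = schema["type"] if isinstance(schema, dict) else schema
    if kind == "null":
        return value is None
    if kind == "boolean":
        return isinstance(value, bool)
    if kind in ("int", "long"):
        return isinstance(value, int) and not isinstance(value, bool)
    if kind in ("float", "double"):
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind in ("string", "bytes", "fixed", "enum"):
        # All of these are written as JSON strings, hence the ambiguity.
        return isinstance(value, str)
    if kind in ("record", "map"):
        return isinstance(value, dict)
    if kind == "array":
        return isinstance(value, list)
    return False

def matching_branches(union, default):
    """Indexes of every union branch the default could belong to."""
    return [i for i, branch in enumerate(union) if json_matches(branch, default)]

def strict_branch(union, default):
    """Spec wording: the default must match the FIRST branch, or it is invalid."""
    return 0 if json_matches(union[0], default) else None

def first_matching_branch(union, default):
    """Proposed wording: take the first branch that the default matches."""
    hits = matching_branches(union, default)
    return hits[0] if hits else None

# The field from the message above.
FIELD = json.loads("""
{"name": "clientEnvironment", "type": [
  {"name": "BrowserData", "type": "record", "fields": [
    {"name": "osName", "type": "string", "default": ""},
    {"name": "browserVersion", "type": "string", "default": ""},
    {"name": "flashVersion", "type": "string", "default": ""}
  ]},
  "null"
], "default": null}
""")

print(strict_branch(FIELD["type"], FIELD["default"]))          # None: rejected
print(first_matching_branch(FIELD["type"], FIELD["default"]))  # 1: the "null" branch

# The ambiguous case: one JSON string matches three different branches.
AMBIGUOUS = ["string", "bytes", {"type": "fixed", "name": "F", "size": 1}]
print(matching_branches(AMBIGUOUS, "\u0040"))                  # [0, 1, 2]
```

Under the strict rule the field above is invalid, while under a "first matching branch" rule it resolves to the "null" branch. The last call shows why that rule still needs a precise definition of "matching": the position of a branch in the union becomes the only tie-breaker.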

>
>So the default value for a field with a union type as its value is
>assumed to be of the type of the first element of that union.  If the
>first element of that union is "null", then the type of the default
>value must be "null" hence the default value itself can only be "null".
> So the question is whether we require that to be stated explicitly in
>the schema.  I assumed that we did not.  Either way, we should clarify
>the spec around this.
>
>We could continue to insist that, if no default value is explicitly
>specified in the reader's schema, and the writer's schema lacks a field,
>then an error is thrown.  Or we could say that the default value for
>default values is null, so that if the reader adds a field that's a
>union with "null" as its first branch then no default value need be
>present.
>
>Doug
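
The two options Doug describes for a field that the writer's schema lacks and the reader's schema declares without a default can be sketched as follows. This is an illustration only, not Avro's resolution code: `default_for_missing_field` and its `implicit_null` flag are hypothetical names, and the field is represented as a plain dict parsed from the schema JSON.

```python
_MISSING = object()  # sentinel, so an explicit "default": null is not mistaken for "absent"

def default_for_missing_field(field, implicit_null=False):
    """Pick the value a reader would use for a field the writer did not write.

    implicit_null=False: first option, a missing default is always an error.
    implicit_null=True:  second option, a union whose first branch is "null"
                         is treated as if "default": null had been written.
    """
    default = field.get("default", _MISSING)
    if default is not _MISSING:
        return default

    ftype = field["type"]
    null_first = isinstance(ftype, list) and len(ftype) > 0 and ftype[0] == "null"
    if implicit_null and null_first:
        return None

    raise ValueError("field %r has no default value" % field["name"])

no_default = {"name": "nickname", "type": ["null", "string"]}
explicit = {"name": "nickname", "type": ["null", "string"], "default": None}

print(default_for_missing_field(explicit))                        # None
print(default_for_missing_field(no_default, implicit_null=True))  # None
try:
    default_for_missing_field(no_default)                         # first option
except ValueError as err:
    print("error:", err)
```

Note that the second option only helps when "null" is the first branch; a field declared as `["string", "null"]` with no default would still be an error under both options in this sketch.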