-
Question about Avro "records"
Francis Galiegue 2013-02-27, 21:47
This is not written black on white in the spec, so this is a guess: in
its JSON representation, fields defined in a record are the only
permissible fields. And there can be zero fields, in which case the JSON
representation is just the empty object.

Is this correct?

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
-
Re: Question about Avro "records"
Pankaj Shroff 2013-02-27, 21:54
That doesn't seem to be the case, especially because if you define a record
with a bunch of optional fields, then you would end up with an empty object
(or rather an object with null values for its fields). Am I misunderstanding
your question?

On Wed, Feb 27, 2013 at 4:47 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:

> This is not written black on white in the spec, so this is a guess: in
> its JSON representation, fields defined in a record are the only
> permissible fields. And there can be zero fields, in which case the JSON
> representation is just the empty object.
>
> Is this correct?
>
> --
> Francis Galiegue, [EMAIL PROTECTED]
> JSON Schema in Java: http://json-schema-validator.herokuapp.com

--
Pankaj Shroff
[EMAIL PROTECTED]
-
Re: Question about Avro "records"
Francis Galiegue 2013-02-27, 22:14
On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <[EMAIL PROTECTED]> wrote:
> That doesn't seem to be the case, especially because if you define a record
> with a bunch of optional fields, then you would end up with an empty object
> (or rather an object with null values for its fields). Am I misunderstanding
> your question?
>

OK, I have probably misworded the question. Let's say I have a record
defining fields "a" and "b". For simplicity, their permissible values
are ints.

As I understand it:

{ "a": 1 }

is not legal since "b" is not provided.

This:

{ "a": 1, "b": 2, "c": 3 }

is not legal either since "c" is not defined.

BUT: { "a": 1 } can be legal IF a default value is provided for "b".

Am I getting this right, partially right, completely wrong?

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
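[Editorial sketch] The first two cases in the message above can be written down as a small check. This is only an illustration of the strict reading (every declared field present, no undeclared field); it does not settle the third case about defaults, which is the open question here. The helper `check_record` and the schema name `AB` are hypothetical and are not part of any Avro library.

```python
def check_record(schema, datum):
    """Return a list of problems; an empty list means the datum carries
    exactly the fields the record schema declares (strict reading)."""
    declared = [field["name"] for field in schema["fields"]]
    problems = [f"missing field: {name}" for name in declared if name not in datum]
    problems += [f"undeclared field: {key}" for key in datum if key not in declared]
    return problems


# Hypothetical record with two int fields, "a" and "b", as in the message.
AB = {
    "type": "record",
    "name": "AB",
    "fields": [{"name": "a", "type": "int"}, {"name": "b", "type": "int"}],
}

print(check_record(AB, {"a": 1}))                  # ['missing field: b']
print(check_record(AB, {"a": 1, "b": 2, "c": 3}))  # ['undeclared field: c']
print(check_record(AB, {"a": 1, "b": 2}))          # []
```

Field types are deliberately not checked; the sketch only looks at which names are present.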
-
Re: Question about Avro "records"
Pankaj Shroff 2013-02-27, 22:21
Yes, that's right, and the default value can be "null" - which is what makes
"b" an "optional" field.

You can define an optional field by defining it of type "union" in an Avro
schema where the first type in the union is "null" and the second type is
"long" or integer in your case.

Something like this (.avsc or .avpr file would have the following JSON):

{
    "type": "record",
    "name": "OptionalFieldsExample",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": ["null", "long"]},
        {"name": "c", "type": ["null", "long"]}
    ]
}

On Wed, Feb 27, 2013 at 5:14 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:

> On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <[EMAIL PROTECTED]> wrote:
> > That doesn't seem to be the case, especially because if you define a record
> > with a bunch of optional fields, then you would end up with an empty object
> > (or rather an object with null values for its fields). Am I misunderstanding
> > your question?
> >
>
> OK, I have probably misworded the question. Let's say I have a record
> defining fields "a" and "b". For simplicity, their permissible values
> are ints.
>
> As I understand it:
>
> { "a": 1 }
>
> is not legal since "b" is not provided.
>
> This:
>
> { "a": 1, "b": 2, "c": 3 }
>
> is not legal either since "c" is not defined.
>
> BUT: { "a": 1 } can be legal IF a default value is provided for "b".
>
> Am I getting this right, partially right, completely wrong?
>
> --
> Francis Galiegue, [EMAIL PROTECTED]
> JSON Schema in Java: http://json-schema-validator.herokuapp.com

--
Pankaj Shroff
[EMAIL PROTECTED]
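[Editorial sketch] The schema as posted declares the union types but no "default" attribute. In Avro the default is a separate attribute on the field, and per the spec current at the time of this thread, a default for a union-typed field has to match the first branch of the union, which is why "null" is listed first. The sketch below compares the two variants; `optional_fields` is a hypothetical helper, not an Avro API, and Python's `None` stands in for JSON `null`.

```python
def optional_fields(schema):
    """Names of fields whose type is a union starting with "null"
    and whose declared default is null (None in Python)."""
    names = []
    for field in schema["fields"]:
        field_type = field["type"]
        is_null_first_union = isinstance(field_type, list) and field_type[:1] == ["null"]
        has_null_default = "default" in field and field["default"] is None
        if is_null_first_union and has_null_default:
            names.append(field["name"])
    return names


# The schema from the message, unchanged: unions, but no defaults.
as_posted = {
    "type": "record",
    "name": "OptionalFieldsExample",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": ["null", "long"]},
        {"name": "c", "type": ["null", "long"]},
    ],
}

# Assumed variant: the same schema with an explicit null default on "b" and "c".
with_defaults = {
    "type": "record",
    "name": "OptionalFieldsExample",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": ["null", "long"], "default": None},
        {"name": "c", "type": ["null", "long"], "default": None},
    ],
}

print(optional_fields(as_posted))      # []
print(optional_fields(with_defaults))  # ['b', 'c']
```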
-
Re: Question about Avro "records"
Francis Galiegue 2013-02-27, 22:52
On Wed, Feb 27, 2013 at 11:21 PM, Pankaj Shroff <[EMAIL PROTECTED]> wrote:
> Yes, that's right, and the default value can be "null" - which is what makes
> "b" an "optional" field.
>
> You can define an optional field by defining it of type "union" in an Avro
> schema where the first type in the union is "null" and the second type is
> "long" or integer in your case.
>
> Something like this (.avsc or .avpr file would have the following JSON):
>
> {
>     "type": "record",
>     "name": "OptionalFieldsExample",
>     "fields": [
>         {"name": "a", "type": "long"},
>         {"name": "b", "type": ["null", "long"]},
>         {"name": "c", "type": ["null", "long"]}
>     ]
> }
>

Is that a reader's or a writer's schema?

Sorry for the newbie questions...

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
-
Re: Question about Avro "records"
Doug Cutting 2013-02-27, 22:20
That sounds right to me. To be clear, the schema in question here is
the writer's. A reader schema which did not have "c" could read this,
dropping the "c" values from the writer's schema.

Doug

On Wed, Feb 27, 2013 at 2:14 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:
> On Wed, Feb 27, 2013 at 10:54 PM, Pankaj Shroff <[EMAIL PROTECTED]> wrote:
>> That doesn't seem to be the case, especially because if you define a record
>> with a bunch of optional fields, then you would end up with an empty object
>> (or rather an object with null values for its fields). Am I misunderstanding
>> your question?
>>
>
> OK, I have probably misworded the question. Let's say I have a record
> defining fields "a" and "b". For simplicity, their permissible values
> are ints.
>
> As I understand it:
>
> { "a": 1 }
>
> is not legal since "b" is not provided.
>
> This:
>
> { "a": 1, "b": 2, "c": 3 }
>
> is not legal either since "c" is not defined.
>
> BUT: { "a": 1 } can be legal IF a default value is provided for "b".
>
> Am I getting this right, partially right, completely wrong?
>
> --
> Francis Galiegue, [EMAIL PROTECTED]
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
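[Editorial sketch] The reader-side behaviour described above (a reader schema without "c" drops the written "c" values) is part of Avro's schema resolution rules. The spec also says that a field present only in the reader's schema takes the reader's default, and that it is an error if no default is declared. A minimal model of those rules follows. `resolve` is a hypothetical helper working on already-decoded dicts, and it treats "key present in the written record" as a stand-in for "field present in the writer's schema".

```python
def resolve(reader_schema, written):
    """Project a decoded writer record onto the reader's fields."""
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in written:
            result[name] = written[name]       # field known to both sides
        elif "default" in field:
            result[name] = field["default"]    # reader supplies its default
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return result                              # writer-only fields are dropped


# Hypothetical reader schemas for data written with fields a, b, c.
reader_ab = {"type": "record", "name": "R", "fields": [
    {"name": "a", "type": "long"},
    {"name": "b", "type": "long"},
]}
reader_abd = {"type": "record", "name": "R", "fields": [
    {"name": "a", "type": "long"},
    {"name": "b", "type": "long"},
    {"name": "d", "type": "long", "default": 0},
]}

print(resolve(reader_ab, {"a": 1, "b": 2, "c": 3}))   # {'a': 1, 'b': 2}
print(resolve(reader_abd, {"a": 1, "b": 2, "c": 3}))  # {'a': 1, 'b': 2, 'd': 0}
```

In this model the default is applied while reading; the writer still supplies a value for every field of its own schema.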
-
Re: Question about Avro "records"
Francis Galiegue 2013-02-27, 22:51
On Wed, Feb 27, 2013 at 11:20 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> That sounds right to me. To be clear, the schema in question here is
> the writer's. A reader schema which did not have "c" could read this,
> dropping the "c" values from the writer's schema.
>

Hmm, OK, the reader/writer distinction is something I am not accustomed to.

Who can "produce" default values? The reader, the writer or both?

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
+
Doug Cutting 2013-02-27, 23:10
-
Re: Question about Avro "records"
Francis Galiegue 2013-02-27, 23:55
On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:
>> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
>
> http://avro.apache.org/docs/current/spec.html#Schema+Resolution
>
> Doug

Hmmm, that does not quite answer my question about production of default values.

For instance, when emitting data from an avro schema which reads:

{
    "type": "record",
    "name": "whatever",
    "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
}

is emitting {} legal?

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
-
Re: Question about Avro "records"
Doug Cutting 2013-02-28, 01:12
I don't think that's valid. In general a value should be written for
each field in the writer's schema. It's not ideal for json, but those
are the rules for binary and it's best if such read/write logic can be
unaware of whether json or binary are being produced. I suspect such
json would break existing implementations, but I've not checked.

Doug

On Wed, Feb 27, 2013 at 3:55 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:
> On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>> On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:
>>> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
>>
>> http://avro.apache.org/docs/current/spec.html#Schema+Resolution
>>
>> Doug
>
> Hmmm, that does not quite answer my question about production of default values.
>
> For instance, when emitting data from an avro schema which reads:
>
> {
>     "type": "record",
>     "name": "whatever",
>     "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
> }
>
> is emitting {} legal?
>
> --
> Francis Galiegue, [EMAIL PROTECTED]
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
-
Re: Question about Avro "records"
Jeremy Kahn 2013-02-28, 01:30
There seems to be no way to easily use the avro libraries in Python (where I
feel most qualified to comment) to encode generics with "missing default
values" and have them transmitted in well-formed avro binary.

If you fill in the "missing" default values, the Python libraries will
transmit correctly.

I'd be happy to add methods to the avro.RecordSchema objects (in the Python
libraries) that "fill defaults" on missing member fields of a record,
recursively (which probably means method extension of other schema classes
as well). Shall I open a JIRA ticket for this for 1.7.5?

(Does providing this for Python put me on the hook for such a thing in other
implementation languages? I hope not.)

For backwards compatibility (and probably to avoid unnecessary data
traversal), you'll probably want to explicitly ask the schema to fill in
defaults before transmission in the cases where you'd like to generate the
impoverished JSON from your example.

This seems related to earlier discussion today about designing constructors
to generate defaults already filled in.

Jeremy

On Feb 27, 2013 3:55 PM, "Francis Galiegue" <[EMAIL PROTECTED]> wrote:

> On Thu, Feb 28, 2013 at 12:10 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> > On Wed, Feb 27, 2013 at 2:51 PM, Francis Galiegue <[EMAIL PROTECTED]> wrote:
> >> Hmm, OK, the reader/writer distinction is something I am not accustomed to.
> >
> > http://avro.apache.org/docs/current/spec.html#Schema+Resolution
> >
> > Doug
>
> Hmmm, that does not quite answer my question about production of default
> values.
>
> For instance, when emitting data from an avro schema which reads:
>
> {
>     "type": "record",
>     "name": "whatever",
>     "fields": [ { "name": "a", "type": { "type": "int", "default": 0 } } ]
> }
>
> is emitting {} legal?
>
> --
> Francis Galiegue, [EMAIL PROTECTED]
> JSON Schema in Java: http://json-schema-validator.herokuapp.com
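[Editorial sketch] The "fill defaults" step proposed above could look roughly like the function below. This is not the Python avro library's API and not the eventual JIRA patch. `fill_defaults` is a hypothetical standalone function over plain dicts, and it assumes defaults are declared as a field-level "default" attribute (where the Avro spec puts them) and that nested records are given inline rather than by name.

```python
import copy


def fill_defaults(schema, datum):
    """Return a copy of datum with missing record fields replaced by the
    field-level defaults from the schema, recursing into inline records."""
    if not (isinstance(schema, dict) and schema.get("type") == "record"):
        return datum  # primitives, unions, arrays, maps: left untouched here
    filled = {}
    for field in schema["fields"]:
        name = field["name"]
        if name in datum:
            filled[name] = fill_defaults(field["type"], datum[name])
        elif "default" in field:
            # Copy so callers cannot mutate the schema's default by accident.
            default = copy.deepcopy(field["default"])
            filled[name] = fill_defaults(field["type"], default)
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return filled  # note: keys not declared in the schema are dropped


# Hypothetical schema: an int with a default plus a nested record.
outer = {"type": "record", "name": "Outer", "fields": [
    {"name": "a", "type": "int", "default": 0},
    {"name": "inner", "type": {"type": "record", "name": "Inner", "fields": [
        {"name": "x", "type": "int", "default": 7},
    ]}},
]}

print(fill_defaults(outer, {"inner": {}}))  # {'a': 0, 'inner': {'x': 7}}
```

Calling it explicitly before serialization matches the opt-in behaviour suggested in the message: data that is already complete passes through unchanged, apart from being copied.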
-
Re: Question about Avro "records"
Doug Cutting 2013-02-28, 17:11
On Wed, Feb 27, 2013 at 5:30 PM, Jeremy Kahn <[EMAIL PROTECTED]> wrote:
> I'd be happy to add methods to the avro.RecordSchema objects (in the Python
> libraries) that "fill defaults" on missing member fields of a record,
> recursively (which probably means method extension of other schema classes
> as well). Shall I open a JIRA ticket for this for 1.7.5?

That would be great. Thanks!

> (Does providing this for Python put me on the hook for such a thing in other
> implementation languages? I hope not.)

Not at all.
Cheers,
Doug
-
Re: Question about Avro "records"
Francis Galiegue 2013-02-28, 01:40
On Thu, Feb 28, 2013 at 2:30 AM, Jeremy Kahn <[EMAIL PROTECTED]> wrote:
> There seems to be no way to easily use the avro libraries in Python (where I
> feel most qualified to comment) to encode generics with "missing default
> values" and have them transmitted in well-formed avro binary.
>
> If you fill in the "missing" default values, the Python libraries will
> transmit correctly.
>
> I'd be happy to add methods to the avro.RecordSchema objects (in the Python
> libraries) that "fill defaults" on missing member fields of a record,
> recursively (which probably means method extension of other schema classes
> as well). Shall I open a JIRA ticket for this for 1.7.5?
>
> (Does providing this for Python put me on the hook for such a thing in other
> implementation languages? I hope not.)
>
> For backwards compatibility (and probably to avoid unnecessary data
> traversal), you'll probably want to explicitly ask the schema to fill in
> defaults before transmission in the cases where you'd like to generate the
> impoverished JSON from your example.
>
> This seems related to earlier discussion today about designing constructors
> to generate defaults already filled in.
>

In fact, I was just asking how this should be handled because I have
just finished writing an Avro schema to JSON Schema conversion
processor (which I'll put online soon), so I wanted to be as accurate
as possible when generating schemas ;)

Right now the generated schemas require all properties, even ones
having default values. I was wondering if that was the right thing to
do...

(and next I'm attacking the reverse: JSON Schema to Avro schema...)

--
Francis Galiegue, [EMAIL PROTECTED]
JSON Schema in Java: http://json-schema-validator.herokuapp.com
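[Editorial sketch] To make the "require all properties" choice concrete, here is a toy conversion for records with primitive fields. It is not the processor mentioned above. `to_json_schema` and the `PRIMITIVES` table are hypothetical, the output assumes the array form of "required" (JSON Schema draft-04 and later), and unions, arrays, maps, enums, bytes and fixed are left out.

```python
# Avro primitive type name -> JSON Schema fragment (subset only).
PRIMITIVES = {
    "null": {"type": "null"},
    "boolean": {"type": "boolean"},
    "int": {"type": "integer"},
    "long": {"type": "integer"},
    "float": {"type": "number"},
    "double": {"type": "number"},
    "string": {"type": "string"},
}


def to_json_schema(avro):
    """Convert a simple Avro schema (primitives and records) to JSON Schema,
    requiring every record field whether or not it declares a default."""
    if isinstance(avro, str):
        return dict(PRIMITIVES[avro])
    if isinstance(avro, dict) and avro.get("type") in PRIMITIVES:
        return dict(PRIMITIVES[avro["type"]])
    if isinstance(avro, dict) and avro.get("type") == "record":
        names = [field["name"] for field in avro["fields"]]
        return {
            "type": "object",
            "properties": {f["name"]: to_json_schema(f["type"]) for f in avro["fields"]},
            "required": names,              # all fields, defaults or not
            "additionalProperties": False,  # undeclared fields are rejected
        }
    raise NotImplementedError(f"unsupported Avro schema: {avro!r}")


# Hypothetical record with a field-level default on "a".
whatever = {"type": "record", "name": "whatever",
            "fields": [{"name": "a", "type": "int", "default": 0}]}

print(to_json_schema(whatever))
```

With this output the empty object {} fails validation because "a" is required, which is consistent with the earlier answer that a value should be written for each field in the writer's schema.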