Saptarshi Guha
2012-06-25, 06:17
Scott Carey
2012-06-26, 01:42
Saptarshi Guha
2012-06-26, 02:27
Saptarshi Guha
2012-06-26, 02:45
Douglas Creager
2012-06-26, 12:02
-
C/C++ parsing vs. Java parsing. Saptarshi Guha 2012-06-25, 06:17
I have an Avro schema, found here: http://sguha.pastebin.mozilla.org/1677671

I tried

    java -jar avro-tools-1.7.0.jar compile schema ~/tmp/robject.avro foo

and it worked.

This failed:

    avrogencpp --input ~/tmp/robject.avro --output ~/tmp/h2
    Segmentation fault: 11

This also failed:

    avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
    (avro_schema_from_json_literal(string.of.avro.file), person_schema)

with

    Error was Error parsing JSON: string or '}' expected near end of file

Q1: Do the C and C++ APIs support all schemas the Java one supports?
Q2: If the answer to Q1 is yes, is this a bug?

Regards
Saptarshi
-
Re: C/C++ parsing vs. Java parsing. Scott Carey 2012-06-26, 01:42
The schema provided is a union of several schemas. Java supports parsing this; C++ may not.

Does it work if you make it one single schema and nest "NA", "acomplex" and "retypes" inside of "object"? A named type only needs to be defined the first time it is referenced.

If it does not, then it is certainly a bug. Either way I would file a bug in JIRA. The spec does not say whether a file should be parseable if it contains a union rather than a record, but it probably should be.

-Scott
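The "define once, then refer by name" layout suggested above can be sketched as a schema held in a C string. This is only an illustration: the record and field names (`example`, `first`, `second`) are hypothetical and are not taken from the pastebin schema, and `count_occurrences` is a small helper written here to check the layout, not part of any Avro API.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative schema only; the record and field names are hypothetical.
 * "complex" is defined in full where it is first used (field "first")
 * and is referred to by name afterwards (field "second"). */
const char NESTED_SCHEMA[] =
    "{\"type\": \"record\", \"name\": \"example\", \"fields\": ["
    "  {\"name\": \"first\", \"type\": {"
    "     \"type\": \"record\", \"name\": \"complex\", \"fields\": ["
    "       {\"name\": \"re\", \"type\": \"double\"},"
    "       {\"name\": \"im\", \"type\": \"double\"}]}},"
    "  {\"name\": \"second\", \"type\": \"complex\"}"
    "]}";

/* Count non-overlapping occurrences of needle in haystack. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t count = 0;
    size_t step = strlen(needle);
    if (step == 0) {
        return 0;
    }
    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + step, needle)) {
        count++;
    }
    return count;
}
```

Here the full definition of `complex` appears once and the later field only names it, which is the single-schema shape the reply asks about. Note that every double quote in the JSON has to be escaped inside the C literal.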
-
Re: C/C++ parsing vs. Java parsing. Saptarshi Guha 2012-06-26, 02:27
Hi Scott,

Thanks for the response. I changed the Avro file to [1].

1. Java works.

2. avrogencpp

    avrogencpp -i ~/tmp/robject.avro -o foo

works.

3. C

    avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
    (avro_schema_from_json_literal(jsonstring, person_schema))

returns:

    Error was Error parsing JSON: string or '}' expected near end of file

So is this a bug, or am I calling it wrong?

Ideally, I would like a union of

    ["NULL","RAW","INTEGER","REAL","COMPLEX","LOGICAL","STRING","LIST"]

Each of these would be a record with (1) a typed value (possibly an array of integers, though for COMPLEX it is an array of records) and (2) another field called Attributes, e.g.

    [
      {"type":"record", "name":"REAL",
       "fields":[
         {"name":"whattype", "type":"myrtype"},
         {"name":"value", "type":"array" , "items":"double"},
         {"name":"attrs" , "type":"attrytpe"}
       ]
      },
      {"type":"record", "name":"INTEGER",
       "fields":[
         {"name":"whattype", "type":"myrtype"},
         {"name":"value", "type":"array" , "items":"integers"},
         {"name":"attrs" , "type":"attrytpe"}
       ]
      }
      ,...
    ]

Here 'attrytpe' is a map type defined elsewhere and "myrtype" is an enum defined elsewhere. Similarly, for the complex one in the union, its 'values' field would be an array of a "complex type" defined elsewhere?

Would I need multiple Avro files using the same namespace? Or is this the serialized equivalent of what I have in [1]?
Thanks for your time
Saptarshi

[1]
{
  "namespace": "robjects.avro",
  "type": "record",
  "name": "robject",
  "doc" : "Encoding of some of the R data types",
  "fields": [
    {"name":"typeof" ,"type":{"type":"enum", "name":"thetype" ,"symbols":
        ["NULL","RAW","INTEGER","REAL","COMPLEX","LOGICAL","STRING","LIST","ATTRIBUTES"]}},
    {"name":"NAtype" ,"type":{"type":"enum" , "name":"NA" ,"symbols":["NA"]}},
    {"name":"complextype","type":{"type":"record" , "name":"complex", "fields":[
        {"name":"re", "type":"double"},
        {"name":"im", "type":"double"}
    ]}},
    {"name":"NULL" ,"type":"null"},
    {"name":"RAW" ,"type":["null",{"type":"array" ,"items":"bytes"}]},
    {"name":"INTEGER" ,"type":["null",{"type":"array" ,"items":"int"}]},
    {"name":"REAL" ,"type":["null",{"type":"array" ,"items":"double"}]},
    {"name":"COMPLEX" ,"type":["null",{"type":"array" ,"items":"complex"}]},
    {"name":"LOGICAL" ,"type":["null",{"type":"array" ,"items":["boolean","NA"]}]},
    {"name":"STRING" ,"type":["null",{"type":"array" ,"items":["string","NA"]}]},
    {"name":"LIST" ,"type":["null",{"type":"array" ,"items":["robject"]}]},
    {"name":"ATTRIBUTES" ,"type":["null",{"type":"map" ,"values":"robject"}]}
  ]
}
-
Re: C/C++ parsing vs. Java parsing. Saptarshi Guha 2012-06-26, 02:45
I should mention:

a) I need Java and C, because the messages will be consumed by Java and C.
b) I'd rather stay away from C++ because of the Boost dependency. Nothing against it; it just becomes another installation hurdle.
c) I need to check other languages, e.g. Python, since I am looking forward to language interop.

Thanks again
Saptarshi
-
Re: C/C++ parsing vs. Java parsing. Douglas Creager 2012-06-26, 12:02
> 3. C
>
> avro_schema_t *person_schema = (avro_schema_t*)malloc(sizeof(avro_schema_t));
> (avro_schema_from_json_literal(jsonstring, person_schema))
>
> returns:
>
> Error was Error parsing JSON: string or '}' expected near end of file
>
> So is this a bug? or am i calling it wrong.

That error message is from the JSON parser we use internally; it claims that there's a syntax error in the JSON that you've passed in. Can you send us the snippet where you define jsonstring? It might be an issue of escaping things correctly in the C string literal.

Also, there's a comment where avro_schema_from_json_literal is defined, saying that jsonstring must be defined as a "char[]" and not a "char *".

And of course it could also be an actual syntax error. :-)

-doug
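The "char[]" versus "char *" distinction matters for any helper that measures its argument with sizeof. The sketch below assumes that is why the comment exists; `JSON_LITERAL_LENGTH` is a hypothetical stand-in written for this illustration, not the library's macro, and the schema text is a made-up minimal record.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a helper that measures a literal with sizeof.
 * It gives the text length only when its argument is a char array. */
#define JSON_LITERAL_LENGTH(json) (sizeof((json)) - 1)

/* Declared as char[]: sizeof covers the whole array, including the NUL. */
const char schema_as_array[] =
    "{\"type\": \"record\", \"name\": \"r\", \"fields\": []}";

/* Declared as char *: sizeof covers only the pointer itself. */
const char *schema_as_pointer = schema_as_array;

/* Length computed from the array form: equals strlen of the text. */
size_t length_from_array(void)
{
    return JSON_LITERAL_LENGTH(schema_as_array);
}

/* Length computed from the pointer form: sizeof(char *) - 1,
 * whatever the text happens to be. */
size_t length_from_pointer(void)
{
    return JSON_LITERAL_LENGTH(schema_as_pointer);
}
```

With the pointer form, the computed length is independent of the JSON text, so a parser handed that length would see only the first few bytes of the schema. A truncated document is one plausible way to end up with an "expected near end of file" error, though nothing in the thread confirms that this was the cause here.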