|
Robert Minsk
2012-12-20, 21:25
Martin Kleppmann
2012-12-21, 11:43
Robert Minsk
2012-12-21, 19:18
Doug Cutting
2012-12-21, 19:42
Robert Minsk
2012-12-21, 19:50
Doug Cutting
2012-12-21, 20:03
Robert Minsk
2012-12-21, 19:59
|
-
unsigned integer types
Robert Minsk 2012-12-20, 21:25
I am currently testing Avro for our network serialization for a mix of
C++ and Python programs. I have noticed that Avro does not offer unsigned
32-bit or unsigned 64-bit integer types. How are people currently handling
unsigned integers? Are there any plans to add unsigned integer types?

--
Robert Minsk
Systems and Software Engineer
WWW.METHODSTUDIOS.COM <http://www.methodstudios.com>
730 Arizona Ave, Santa Monica, CA 90401
O:+1 310 434 6500 // F:+1 310 434 6501
-
Re: unsigned integer types
Martin Kleppmann 2012-12-21, 11:43
If your numbers are typically small, you can just use a signed type;
the sign bit's overhead is insignificant.

If your numbers typically use the full range of 0 to 2^64-1 (or 0 to
2^32-1), e.g. because they are hashes or random numbers from that
range, you're best off using the 'fixed' type and specifying the
number of bytes you want. In this case 'fixed' is more efficient than
the variable-length encoding of int/long, because there is no overhead
for indicating the length; it's simply stored as that number of bytes,
and nothing else.

Because those two options cover most use cases, I don't think there
are any plans to add unsigned int support to Avro.

Martin
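To make the size trade-off described above concrete, here is a minimal Python sketch. It is not from the thread. It assumes Avro's documented binary encoding of long (zig-zag followed by a variable-length integer) and stores an unsigned 64-bit value in an 8-byte fixed. All function names are made up for illustration, and the big-endian byte order is an assumption: Avro gives no numeric meaning to the bytes of a fixed, so the C++ and Python sides would have to agree on a byte order themselves.

```python
import struct

MASK64 = (1 << 64) - 1


def zigzag_varint_len(n: int) -> int:
    """Bytes needed to write a signed 64-bit value as a zig-zag varint."""
    # Zig-zag maps small magnitudes (positive or negative) to small codes.
    z = ((n << 1) ^ (n >> 63)) & MASK64
    length = 1
    while z >= 0x80:  # each output byte carries 7 payload bits
        z >>= 7
        length += 1
    return length


def u64_as_signed(u: int) -> int:
    """Reinterpret an unsigned 64-bit value as signed (two's complement)."""
    return u - (1 << 64) if u >= (1 << 63) else u


def u64_to_fixed8(u: int) -> bytes:
    """Pack an unsigned 64-bit value into 8 bytes (big-endian by assumption)."""
    return struct.pack(">Q", u)


def fixed8_to_u64(b: bytes) -> int:
    """Inverse of u64_to_fixed8."""
    return struct.unpack(">Q", b)[0]


if __name__ == "__main__":
    small = 1
    large = 1 << 63  # a value that uses the top bit of the unsigned range
    print(zigzag_varint_len(u64_as_signed(small)))  # varint size for a small value
    print(zigzag_varint_len(u64_as_signed(large)))  # varint size for a full-range value
    print(len(u64_to_fixed8(large)))                # fixed is always 8 bytes
```

A small value costs one byte as a long, while a value spread over the full 64-bit range can cost up to ten bytes as a long but always exactly eight as a fixed of size 8.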
-
Re: unsigned integer types
Robert Minsk 2012-12-21, 19:18
Fixed does not seem to work.

avro-1.7.3

fixed.json:
{
  "type": "record",
  "name": "fixed",
  "fields" : [
    {"name": "foo", "type": "fixed", "size": 8}
  ]
}

./avrogencpp -i fixed.json -o fixed.hh
Segmentation fault (core dumped)

If I change the fixed.json to:
{
  "type": "record",
  "name": "test_fixed",
  "fields" : [
    {"name": "foo", "type": "fixed", "size": 8}
  ]
}

./avrogencpp -i fixed.json -o fixed.hh
Failed to parse or compile schema: Unknown type: fixed
-
Re: unsigned integer types
Doug Cutting 2012-12-21, 19:42
That schema's not quite right.

A fixed schema looks like:

  {"name": "foo", "type": "fixed", "size": 8}

A record field looks like:

  {"name": "foo", "type": <schema>}

So a record with a fixed as a field would look like:

{
  "type": "record",
  "name": "recordWithFixed",
  "fields" : [
    {"name": "fixedValue", "type": {"name": "myFixed", "type": "fixed", "size": 8}}
  ]
}

Alternately you can perhaps just skip the record and use the fixed
schema directly.

Doug
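The difference between the rejected schema and the corrected one can be checked mechanically. The following Python sketch uses only the standard json module, not an Avro library. `misplaced_fixed_fields` is a hypothetical helper that flags record fields whose "type" is the bare string "fixed" instead of a nested fixed schema. In that position a string is presumably read as the name of an already defined type, which would explain the "Unknown type: fixed" message.

```python
import json

# The schema that avrogencpp rejected: "size" sits on the field itself.
BAD = json.loads("""
{"type": "record", "name": "test_fixed",
 "fields": [{"name": "foo", "type": "fixed", "size": 8}]}
""")

# The corrected form: the field's "type" is a complete, named fixed schema.
GOOD = json.loads("""
{"type": "record", "name": "recordWithFixed",
 "fields": [{"name": "fixedValue",
             "type": {"name": "myFixed", "type": "fixed", "size": 8}}]}
""")


def misplaced_fixed_fields(record_schema: dict) -> list:
    """Return the names of fields that say "type": "fixed" directly."""
    return [field["name"]
            for field in record_schema.get("fields", [])
            if field.get("type") == "fixed"]


if __name__ == "__main__":
    print(misplaced_fixed_fields(BAD))   # fields that need a nested schema
    print(misplaced_fixed_fields(GOOD))  # nothing to report
```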
-
Re: unsigned integer types
Robert Minsk 2012-12-21, 19:50
So what is the second required name field for? In your example, "myFixed".
-
Re: unsigned integer types
Doug Cutting 2012-12-21, 20:03
Fixed, like record and enum, is a named type. In Java, a separate class is
defined for each fixed type. So that is the name of the fixed type, as
opposed to the name of the fixed field within the record type. Does that
make sense?

Doug
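Because a fixed type carries its own name, the Avro specification lets a schema define it once and refer to it by that name afterwards. The Python sketch below illustrates this with the standard json module only. The field names `first` and `second` and the helper `collect_named_types` are hypothetical, and the walk covers only record fields and unions, not arrays or maps.

```python
import json

# "myFixed" is defined in the first field and referenced by name in the second.
SCHEMA = json.loads("""
{"type": "record", "name": "recordWithFixed",
 "fields": [
   {"name": "first",  "type": {"name": "myFixed", "type": "fixed", "size": 8}},
   {"name": "second", "type": "myFixed"}
 ]}
""")

NAMED_KINDS = ("record", "enum", "fixed")


def collect_named_types(schema, found=None):
    """Walk a parsed schema and map each named type's name to its kind."""
    if found is None:
        found = {}
    if isinstance(schema, dict):
        kind = schema.get("type")
        if isinstance(kind, str) and kind in NAMED_KINDS and "name" in schema:
            found[schema["name"]] = kind
        for field in schema.get("fields", []):
            collect_named_types(field.get("type"), found)
    elif isinstance(schema, list):  # a union is a JSON array of schemas
        for branch in schema:
            collect_named_types(branch, found)
    return found


if __name__ == "__main__":
    named = collect_named_types(SCHEMA)
    print(named)
    # The second field's type is just a string that names the fixed type.
    print(SCHEMA["fields"][1]["type"] in named)
```

The field name (`first`, `second`) identifies a slot in the record, while the type name (`myFixed`) identifies the type itself, which is why both are required.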
-
Re: unsigned integer types
Robert Minsk 2012-12-21, 19:59
I see what the second name is for. A bit confusing.

If I just have a union, the second name is used for the set method. This is
so you can have multiple fixed value sizes in a union.
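The Avro specification allows a union to contain more than one fixed type, provided each has a distinct name. The Python sketch below shows such a union and checks that the names differ. The type names `hash8` and `hash16` and both helper functions are hypothetical, and nothing here reflects how avrogencpp derives its accessor names.

```python
import json

# A union (JSON array) with two fixed branches of different sizes.
UNION = json.loads("""
[
  {"name": "hash8",  "type": "fixed", "size": 8},
  {"name": "hash16", "type": "fixed", "size": 16}
]
""")


def branch_names(union_schema: list) -> list:
    """Names of the named-type branches (record, enum, fixed) in a union."""
    return [branch["name"]
            for branch in union_schema
            if isinstance(branch, dict)
            and branch.get("type") in ("record", "enum", "fixed")]


def names_are_distinct(union_schema: list) -> bool:
    """True if no two named branches share a name."""
    names = branch_names(union_schema)
    return len(names) == len(set(names))


if __name__ == "__main__":
    print(branch_names(UNION))
    print(names_are_distinct(UNION))
```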
|