Avro, mail # dev - unsigned types


Re: unsigned types
Martin Kleppmann 2013-12-11, 17:16
Personally, I think it's a good design decision that Avro doesn't support
unsigned types.

Whether you use signed or unsigned only makes a difference if you expect to
have numbers between 2^63 and 2^64-1 (if you have numbers between 2^31 and
2^32-1 you can use the Avro 'long' type instead of the 'int' type). And if
your numbers are indeed between 2^63 and 2^64-1, you're better off using an
8-byte 'fixed' type, which always uses exactly 8 bytes, rather than a 'long',
which can use up to 10 bytes for such a large number because of the
variable-length encoding.
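
To make the size argument concrete, here is a minimal sketch of how an Avro
'long' is laid out (zigzag coding followed by a base-128 variable-length
integer). The helper names are made up for illustration and are not part of
any Avro library; the byte order chosen for the 'fixed' value is also just an
assumption, since Avro treats 'fixed' as opaque bytes.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def zigzag64(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (zigzag coding)."""
    return ((n << 1) ^ (n >> 63)) & MASK64

def varint(u: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, least significant first."""
    out = bytearray()
    while True:
        group = u & 0x7F
        u >>= 7
        if u:
            out.append(group | 0x80)  # more groups follow
        else:
            out.append(group)
            return bytes(out)

def encode_avro_long(n: int) -> bytes:
    """Hypothetical helper: binary form of an Avro 'long'."""
    return varint(zigzag64(n))

def as_signed64(u: int) -> int:
    """Reinterpret an unsigned 64-bit value as signed (two's complement)."""
    return u - (1 << 64) if u >= (1 << 63) else u

big = 1 << 63                                   # smallest value that needs the top bit
long_bytes = encode_avro_long(as_signed64(big))
fixed_bytes = struct.pack("<Q", big)            # assumed little-endian 8-byte 'fixed'
print(len(long_bytes), len(fixed_bytes))
```

Under these assumptions the 'long' form of 2^63 takes 10 bytes while the
'fixed' form takes 8. Small values go the other way: encode_avro_long(1) is a
single byte.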

Another problem with unsigned types can be seen in Protocol Buffers (which
supports both signed and unsigned): if you do accidentally put -1 in a
field with an unsigned type, the resulting encoding is ten bytes long, a
surprising and unnecessary gotcha. See
https://developers.google.com/protocol-buffers/docs/encoding#types
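
The -1 case can be sketched in a few lines. This is an illustration of the
varint wire format only, not the Protocol Buffers library API: the function
names are hypothetical, and the wrap of -1 to 2^64-1 models what happens when
a signed value is implicitly converted to an unsigned 64-bit field (as in
C++); some language bindings may reject the value instead.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def encode_varint(u: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set while more bytes follow."""
    out = bytearray()
    while True:
        group = u & 0x7F
        u >>= 7
        if u:
            out.append(group | 0x80)
        else:
            out.append(group)
            return bytes(out)

def encode_as_uint64(n: int) -> bytes:
    """Hypothetical helper: wrap n to 64 bits, then varint-encode it."""
    return encode_varint(n & MASK64)

def encode_as_sint64(n: int) -> bytes:
    """Hypothetical helper: zigzag-code n first, as the sint64 type does."""
    return encode_varint(((n << 1) ^ (n >> 63)) & MASK64)

wrapped = encode_as_uint64(-1)   # -1 wraps to 2^64 - 1
zigzag = encode_as_sint64(-1)    # zigzag maps -1 to 1
print(len(wrapped), len(zigzag))
```

With the wrap assumed above, -1 costs ten bytes in the unsigned field and one
byte in the zigzag-coded field.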

Interested to hear other opinions on the matter!

Martin
On 11 December 2013 12:38, Pedro Larroy <[EMAIL PROTECTED]> wrote:

> Hi
>
> Is there any reason, other than the Java-centric focus of Avro, that it
> shouldn't support unsigned types? We use them extensively, and I think they
> would be useful for us*, since we mostly use Avro for C++ <-> Python
> communication.
>
> Would this be accepted in the official Avro distribution?
>
> Pedro.
>
>
> *us: Here, a Nokia business.
>