

Re: BigInt / longlong
On 3/28/12 11:01 AM, "Meyer, Dennis" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> What type refers to a Java BigInt or C long long? Or is there any other type
> in Avro that maps a 64-bit unsigned int?
>
> I unfortunately could only find smaller types in the docs:
> Primitive Types
> The set of primitive type names is:
> * string: unicode character sequence
> * bytes: sequence of 8-bit bytes
> * int: 32-bit signed integer
> * long: 64-bit signed integer
> * float: single precision (32-bit) IEEE 754 floating-point number
> * double: double precision (64-bit) IEEE 754 floating-point number
> * boolean: a binary value
> * null: no value
>
> Anyway, in the encoding section there's some 64-bit unsigned. Can I use them
> somehow by a type?

An unsigned value fits in a signed one. They are both 64 bits. Each language that supports a long unsigned type has its own way to convert from one to the other without loss of data.

> Work around might be to use the 52 significant bits of a double, but seems
> like a hack and of course losing some more number space compared to uint64.
> I'd like to get around any other self-encoding hacks as I'd like to also use
> Hadoop/PIG/HIVE on top of AVRO, so would like to keep functionality on numbers
> if possible.

Java does not have an unsigned 64-bit type. Hadoop/Pig/Hive all only have signed 64-bit integer quantities. Luckily, multiplication and addition on two's complement signed values are identical to the same operations on unsigned ints, so for many operations there is no loss in fidelity as long as you pass the raw bits on to something that interprets the number as an unsigned quantity.

That is, if you take the raw bits of a set of unsigned 64-bit numbers, treat those bits as if they were signed 64-bit quantities, do addition, subtraction, and multiplication on them, and then treat the raw bit result as an unsigned 64-bit value, it is as if you did the whole thing unsigned.
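
To make the point above concrete, here is a minimal Java sketch. The class and helper names (UnsignedAsSigned, toUnsigned, fromUnsigned) are made up for this illustration and are not part of Avro. It stores two unsigned 64-bit values in signed longs, does the arithmetic on the longs, and checks the reinterpreted result against unsigned arithmetic modulo 2^64 done with BigInteger.

```java
import java.math.BigInteger;

public class UnsignedAsSigned {
    // 2^64, the modulus of unsigned 64-bit arithmetic.
    static final BigInteger TWO_64 = BigInteger.ONE.shiftLeft(64);

    // Interpret the raw bits of a signed long as an unsigned value.
    static BigInteger toUnsigned(long bits) {
        BigInteger v = BigInteger.valueOf(bits);
        return bits < 0 ? v.add(TWO_64) : v;
    }

    // Store an unsigned value (0 <= v < 2^64) in a signed long with the same bits.
    // BigInteger.longValue() keeps only the low-order 64 bits.
    static long fromUnsigned(BigInteger v) {
        return v.longValue();
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("18446744073709551000"); // larger than Long.MAX_VALUE
        BigInteger b = new BigInteger("12345");

        // These are the values that would travel through Avro as plain "long".
        long sa = fromUnsigned(a);
        long sb = fromUnsigned(b);

        // Signed arithmetic on the raw bits, reinterpreted as unsigned afterwards,
        // matches unsigned arithmetic modulo 2^64.
        System.out.println(toUnsigned(sa + sb).equals(a.add(b).mod(TWO_64)));      // true
        System.out.println(toUnsigned(sa - sb).equals(a.subtract(b).mod(TWO_64))); // true
        System.out.println(toUnsigned(sa * sb).equals(a.multiply(b).mod(TWO_64))); // true
    }
}
```

Note that this only covers addition, subtraction, and multiplication. Ordering comparisons and division give different answers for signed and unsigned interpretations, so those need unsigned-aware code; on Java 8 or later, Long.compareUnsigned, Long.divideUnsigned, and Long.toUnsignedString are available for that.
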
http://en.wikipedia.org/wiki/Two%27s_complement

Avro only has signed 32- and 64-bit integer quantities because they can be mapped to unsigned ones in most cases without a problem, and many (actually, most) languages do not support unsigned integers.

If you want various-precision quantities, you can use an Avro Fixed type to map to any type you choose. For example, you can use a 16-byte fixed to map to 128-bit unsigned ints.

>
> Thanks,
> Dennis
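
As a rough illustration of the 16-byte fixed suggestion in the reply above, the sketch below converts between a BigInteger and a 16-byte array. Everything named here is an assumption made for the example: the schema name "UInt128", the class UInt128Fixed, and the choice of big-endian byte order. Avro only stores the 16 bytes; readers and writers have to agree on the byte order themselves. The code deliberately uses only the JDK and leaves out the Avro library calls that would wrap the bytes in a fixed value.

```java
import java.math.BigInteger;

public class UInt128Fixed {
    // Hypothetical Avro schema for the fixed type; the name is illustrative only.
    static final String SCHEMA_JSON =
        "{\"type\": \"fixed\", \"name\": \"UInt128\", \"size\": 16}";

    static final int SIZE = 16;

    // Encode 0 <= value < 2^128 as exactly 16 bytes, big-endian (an assumed convention).
    static byte[] encode(BigInteger value) {
        if (value.signum() < 0 || value.bitLength() > 128) {
            throw new IllegalArgumentException("out of range for unsigned 128-bit");
        }
        // toByteArray() is big-endian two's complement of minimal length;
        // it may be 17 bytes long with a leading zero sign byte.
        byte[] raw = value.toByteArray();
        byte[] out = new byte[SIZE];
        int n = Math.min(raw.length, SIZE);
        System.arraycopy(raw, raw.length - n, out, SIZE - n, n);
        return out;
    }

    // Decode 16 big-endian bytes as a non-negative integer.
    static BigInteger decode(byte[] fixed) {
        if (fixed.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " bytes");
        }
        return new BigInteger(1, fixed); // signum 1: treat the bytes as a magnitude
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.ONE.shiftLeft(128).subtract(BigInteger.ONE);
        byte[] bytes = encode(max);
        System.out.println(bytes.length);               // 16
        System.out.println(decode(bytes).equals(max));  // true
    }
}
```
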
