Avro, mail # user - Re: BigInt / longlong


Tatu Saloranta 2012-03-29, 16:54
Scott Carey 2012-03-28, 18:43
Re: BigInt / longlong
Miki Tebeka 2012-03-28, 23:38
I would encode to string. Should be simple enough, just means you need
a pass on the data after reading it.
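
As a rough illustration of the string approach, here is a minimal Python sketch. It assumes the value is stored in an Avro `string` field as plain decimal text; the helper names `encode_uint64` and `decode_uint64` and the field name `id` are hypothetical and not part of any Avro API.

```python
UINT64_MAX = 2**64 - 1

def encode_uint64(value: int) -> str:
    """Turn an unsigned 64-bit integer into decimal text for a string field."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError(f"{value} does not fit in an unsigned 64-bit integer")
    return str(value)

def decode_uint64(text: str) -> int:
    """The extra pass after reading: parse the text back and re-check the range."""
    value = int(text)
    if not 0 <= value <= UINT64_MAX:
        raise ValueError(f"{text!r} is outside the unsigned 64-bit range")
    return value

# A record as it might look just before writing ("id" is a made-up field name).
record = {"id": encode_uint64(UINT64_MAX)}
restored = decode_uint64(record["id"])
```

The cost is that tools reading the file see text, so numeric operations only work after the decode pass.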

On Wed, Mar 28, 2012 at 11:43 AM, Scott Carey <[EMAIL PROTECTED]> wrote:
> On 3/28/12 11:01 AM, "Meyer, Dennis" <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Which type corresponds to a Java BigInteger or a C long long? Or is there any
> other type in Avro that maps to a 64-bit unsigned int?
>
> I unfortunately could only find smaller types in the docs:
>
> Primitive Types
>
> The set of primitive type names is:
>
> string: unicode character sequence
> bytes: sequence of 8-bit bytes
> int: 32-bit signed integer
> long: 64-bit signed integer
> float: single precision (32-bit) IEEE 754 floating-point number
> double: double precision (64-bit) IEEE 754 floating-point number
> boolean: a binary value
> null: no value
>
>
> Anyway, the encoding section mentions some 64-bit unsigned values. Can I use
> them somehow through a type?
>
>
> An unsigned value fits in a signed one.  They are both 64 bits.  Each
> language that supports a long unsigned type has its own way to convert from
> one to the other without loss of data.
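
A minimal sketch of that lossless conversion, written in Python with the standard `struct` module. The function names are hypothetical; the idea is only that the same 8 bytes can be read back as either a signed or an unsigned value, so an unsigned number can travel through an Avro `long` field.

```python
import struct

def unsigned_to_signed(value: int) -> int:
    """Reinterpret the bits of a uint64 as a signed 64-bit integer."""
    # struct.pack raises struct.error if value is outside 0 .. 2**64 - 1.
    return struct.unpack("<q", struct.pack("<Q", value))[0]

def signed_to_unsigned(value: int) -> int:
    """Reinterpret the bits of a signed 64-bit integer as a uint64."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]

big = 2**64 - 1                      # largest unsigned 64-bit value
stored = unsigned_to_signed(big)     # what would go into the long field
recovered = signed_to_unsigned(stored)
```

Values below 2**63 are unchanged by the conversion; larger ones show up as negative numbers in the signed view.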
>
> A workaround might be to use the 53 significant bits of a double, but that
> seems like a hack and of course loses even more number space compared to
> uint64. I'd like to avoid any other self-encoding hacks because I'd also like
> to use Hadoop/Pig/Hive on top of Avro, so I would like to keep numeric
> functionality if possible.
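
To make the lost number space concrete, here is a small Python check. Python's `float` is an IEEE 754 double; the helper name `survives_double` is made up for this sketch.

```python
def survives_double(value: int) -> bool:
    """True if the integer comes back unchanged after a trip through a double."""
    return int(float(value)) == value

LIMIT = 2**53  # above this, not every integer has an exact double representation

exact_at_limit = survives_double(LIMIT)
exact_just_above = survives_double(LIMIT + 1)
exact_at_uint64_max = survives_double(2**64 - 1)
```

Here `exact_at_limit` is true while the other two are false, so a double cannot carry arbitrary uint64 values.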
>
>
> Java does not have an unsigned 64 bit type.  Hadoop/Pig/Hive all only have
> signed 64 bit integer quantities.
>
> Luckily, multiplication and addition on two's complement signed values are
> identical to the operations on unsigned ints, so for many operations there
> is no loss in fidelity as long as you pass the raw bits on to something that
> interprets the number as an unsigned quantity.
>
> That is, if you take the raw bits of a set of unsigned 64-bit numbers,
> treat those bits as if they were signed 64-bit quantities, do addition,
> subtraction, and multiplication on them, and then treat the raw bits of the
> result as an unsigned 64-bit value, it is as if you did the whole thing
> unsigned.
>
> http://en.wikipedia.org/wiki/Two%27s_complement
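
The claim can be checked with a short Python sketch. Python integers do not overflow, so the wrap-around of a 64-bit signed type (such as Java's `long`) is simulated here by masking to 64 bits; all helper names are hypothetical.

```python
MASK = (1 << 64) - 1

def to_signed(bits: int) -> int:
    """Read the low 64 bits as a two's complement signed value."""
    bits &= MASK
    return bits - (1 << 64) if bits >> 63 else bits

def to_unsigned(value: int) -> int:
    """Read the low 64 bits as an unsigned value."""
    return value & MASK

def signed_add(a: int, b: int) -> int:
    """Addition as a 64-bit signed type would do it, with wrap-around."""
    return to_signed(a + b)

def signed_mul(a: int, b: int) -> int:
    """Multiplication as a 64-bit signed type would do it, with wrap-around."""
    return to_signed(a * b)

x, y = 2**63 + 5, 2**62 + 7          # two unsigned inputs
sx, sy = to_signed(x), to_signed(y)  # same bits, signed view

sum_matches = to_unsigned(signed_add(sx, sy)) == (x + y) & MASK
product_matches = to_unsigned(signed_mul(sx, sy)) == (x * y) & MASK

# Ordering is not preserved: 2**63 is larger than 1 when unsigned,
# but its signed view is negative.
order_differs = to_signed(2**63) < to_signed(1)
```

Both `sum_matches` and `product_matches` come out true. `order_differs` is also true, which is a reminder that comparisons (and likewise division and printing) need the unsigned interpretation applied first.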
>
> Avro only has signed 32 and 64 bit integer quantities because they can be
> mapped to unsigned ones in most cases without a problem and many (actually,
> most) languages do not support unsigned integers.
>
> If you want various precision quantities you can use an Avro Fixed type to
> map to any type you choose.  For example you can use a 16 byte fixed to map
> to 128 bit unsigned ints.
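
A sketch of the fixed-type idea in Python, without using any Avro library. The schema name `UInt128`, the helper names, and the big-endian byte order are assumptions made for this example; Avro itself only stores the 16 raw bytes, so writer and reader have to agree on the layout.

```python
import json

# Avro schema for a 16-byte fixed type; the name "UInt128" is hypothetical.
UINT128_SCHEMA = {"type": "fixed", "name": "UInt128", "size": 16}

def encode_uint128(value: int) -> bytes:
    """Pack an unsigned 128-bit integer into 16 big-endian bytes."""
    # int.to_bytes raises OverflowError for negative or too-large values.
    return value.to_bytes(16, "big")

def decode_uint128(raw: bytes) -> int:
    """Unpack 16 big-endian bytes into an unsigned integer."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    return int.from_bytes(raw, "big")

schema_text = json.dumps(UINT128_SCHEMA)
payload = encode_uint128(2**128 - 1)
value = decode_uint128(payload)
```

The same pattern with `size` 8 would cover an unsigned 64-bit value, at the price that Pig and Hive would see raw bytes and not a number.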
>
>
> Thanks,
> Dennis