Avro >> mail # user >> Re: BigInt / longlong


Tatu Saloranta 2012-03-29, 16:54
Scott Carey 2012-03-28, 18:43
Re: BigInt / longlong
I would encode to string. Should be simple enough, just means you need
a pass on the data after reading it.
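
A minimal Java sketch of the string approach, assuming Java 8 or later for the unsigned helpers in java.lang.Long. The class and method names are hypothetical. The value would travel in an Avro string field, and the pass after reading parses it back into the raw 64 bits.

```java
// Hypothetical helper: carry an unsigned 64-bit value in an Avro "string" field.
final class UnsignedAsString {
    // Encode the raw 64 bits as an unsigned decimal string before writing.
    static String encode(long rawBits) {
        return Long.toUnsignedString(rawBits);
    }

    // The extra pass after reading: parse the decimal string back to raw bits.
    static long decode(String text) {
        return Long.parseUnsignedLong(text);
    }

    public static void main(String[] args) {
        long max = -1L; // all 64 bits set, i.e. 2^64 - 1 when read as unsigned
        String wire = encode(max);
        System.out.println(wire);                // 18446744073709551615
        System.out.println(decode(wire) == max); // true
    }
}
```

The trade-off is that downstream tools see a string rather than a number until that parsing pass has run.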

On Wed, Mar 28, 2012 at 11:43 AM, Scott Carey <[EMAIL PROTECTED]> wrote:
> On 3/28/12 11:01 AM, "Meyer, Dennis" <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> What type corresponds to a Java BigInteger or a C long long? Or is there any
> other type in Avro that maps to a 64-bit unsigned int?
>
> I unfortunately could only find smaller types in the docs:
>
> Primitive Types
>
> The set of primitive type names is:
>
> string: unicode character sequence
> bytes: sequence of 8-bit bytes
> int: 32-bit signed integer
> long: 64-bit signed integer
> float: single precision (32-bit) IEEE 754 floating-point number
> double: double precision (64-bit) IEEE 754 floating-point number
> boolean: a binary value
> null: no value
>
>
> Anyway, the encoding section mentions some 64-bit unsigned values. Can I use
> them somehow through a type?
>
>
> An unsigned value fits in a signed one.  They are both 64 bits.  Each
> language that supports a long unsigned type has its own way to convert from
> one to the other without loss of data.
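
For Java, which has no unsigned 64-bit primitive, the conversion might look like the sketch below. The class and method names are hypothetical, and BigInteger is used here only as a convenient way to hold the unsigned value.

```java
import java.math.BigInteger;

// Hypothetical helper: move an unsigned 64-bit value in and out of a signed long.
final class UnsignedBits {
    private static final BigInteger TWO_64 = BigInteger.ONE.shiftLeft(64);

    // Store an unsigned value in [0, 2^64) as the raw bits of a signed long.
    static long toRawBits(BigInteger unsigned) {
        if (unsigned.signum() < 0 || unsigned.compareTo(TWO_64) >= 0) {
            throw new IllegalArgumentException("outside unsigned 64-bit range");
        }
        return unsigned.longValue(); // keeps the low 64 bits unchanged
    }

    // Read the raw bits of a signed long back as an unsigned value.
    static BigInteger fromRawBits(long rawBits) {
        BigInteger value = BigInteger.valueOf(rawBits);
        return rawBits >= 0 ? value : value.add(TWO_64);
    }

    public static void main(String[] args) {
        BigInteger big = new BigInteger("18446744073709551615"); // 2^64 - 1
        long stored = toRawBits(big);
        System.out.println(stored);              // -1
        System.out.println(fromRawBits(stored)); // 18446744073709551615
    }
}
```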
>
> A workaround might be to use the 52 significant bits of a double, but that
> seems like a hack and of course loses some more number space compared to
> uint64. I'd like to avoid any other self-encoding hacks, as I'd also like to
> use Hadoop/Pig/Hive on top of Avro and keep numeric functionality if
> possible.
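
To see where the double workaround runs out, here is a short Java check (class and method names are hypothetical). An IEEE 754 double stores 52 fraction bits plus an implicit leading bit, so integers are exact up to 2^53 and the first one that is lost is 2^53 + 1.

```java
// Hypothetical check: which integers survive being stored in a double?
final class DoublePrecisionLimit {
    // True if the value comes back unchanged after a round trip through double.
    static boolean survivesDouble(long value) {
        return (long) (double) value == value;
    }

    public static void main(String[] args) {
        long limit = 1L << 53; // 9007199254740992
        System.out.println(survivesDouble(limit));     // true
        System.out.println(survivesDouble(limit + 1)); // false
    }
}
```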
>
>
> Java does not have an unsigned 64 bit type.  Hadoop/Pig/Hive all only have
> signed 64 bit integer quantities.
>
> Luckily, multiplication and addition on two's complement signed values are
> identical to the operations on unsigned ints, so for many operations there
> is no loss in fidelity as long as you pass the raw bits on to something that
> interprets the number as an unsigned quantity.
>
> That is, if you take the raw bits of a set of unsigned 64-bit numbers,
> treat those bits as if they were signed 64-bit quantities, do addition,
> subtraction, and multiplication on them, and then treat the raw bit result
> as an unsigned 64-bit value, it is as if you did the whole thing unsigned.
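
The equivalence can be checked in Java against a BigInteger reference that does the same arithmetic modulo 2^64. This is a sketch assuming Java 8 or later, with hypothetical class and method names. Comparison and division are not among the operations that carry over, so the sketch uses Long.compareUnsigned and Long.divideUnsigned for those.

```java
import java.math.BigInteger;

// Hypothetical check: signed long arithmetic versus unsigned arithmetic mod 2^64.
final class UnsignedArithmetic {
    // Reference view of the raw bits as an unsigned value.
    static BigInteger asUnsigned(long rawBits) {
        return new BigInteger(Long.toUnsignedString(rawBits));
    }

    // Reduce a reference result modulo 2^64 by keeping only its low 64 bits.
    static long wrap(BigInteger value) {
        return value.longValue();
    }

    // True if +, - and * on the signed longs match the unsigned reference.
    static boolean sameAsUnsigned(long a, long b) {
        BigInteger ua = asUnsigned(a);
        BigInteger ub = asUnsigned(b);
        return a + b == wrap(ua.add(ub))
            && a - b == wrap(ua.subtract(ub))
            && a * b == wrap(ua.multiply(ub));
    }

    public static void main(String[] args) {
        long a = -1L; // 2^64 - 1 when read as unsigned
        long b = 3L;
        System.out.println(sameAsUnsigned(a, b));           // true
        System.out.println(a < b);                          // true (signed view)
        System.out.println(Long.compareUnsigned(a, b) > 0); // true (unsigned view)
        System.out.println(Long.divideUnsigned(a, 2L));     // 9223372036854775807
    }
}
```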
>
> http://en.wikipedia.org/wiki/Two%27s_complement
>
> Avro only has signed 32 and 64 bit integer quantities because they can be
> mapped to unsigned ones in most cases without a problem and many (actually,
> most) languages do not support unsigned integers.
>
> If you want various precision quantities you can use an Avro Fixed type to
> map to any type you choose.  For example you can use a 16 byte fixed to map
> to 128 bit unsigned ints.
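
A sketch of the 16-byte fixed idea in Java, using BigInteger to hold the 128-bit value. The schema name UInt128, the class and method names, and the big-endian byte order are all assumptions made for illustration. Avro stores the 16 bytes as given, so the byte order is a convention between writer and reader.

```java
import java.math.BigInteger;

// Hypothetical mapping between an unsigned 128-bit value and a 16-byte Avro fixed.
// An assumed schema for the field: {"type": "fixed", "name": "UInt128", "size": 16}
final class UInt128Fixed {
    static final int SIZE = 16;

    // Encode as 16 big-endian bytes, left-padded with zeros.
    static byte[] toFixed(BigInteger unsigned) {
        if (unsigned.signum() < 0 || unsigned.bitLength() > 128) {
            throw new IllegalArgumentException("outside unsigned 128-bit range");
        }
        byte[] magnitude = unsigned.toByteArray(); // may carry a leading sign byte
        byte[] fixed = new byte[SIZE];
        int count = Math.min(magnitude.length, SIZE);
        System.arraycopy(magnitude, magnitude.length - count, fixed, SIZE - count, count);
        return fixed;
    }

    // Decode 16 big-endian bytes as a non-negative value.
    static BigInteger fromFixed(byte[] fixed) {
        if (fixed.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " bytes");
        }
        return new BigInteger(1, fixed); // signum 1: treat bytes as a magnitude
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.ONE.shiftLeft(128).subtract(BigInteger.ONE);
        byte[] bytes = toFixed(max);
        System.out.println(bytes.length);                // 16
        System.out.println(fromFixed(bytes).equals(max)); // true
    }
}
```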
>
>
> Thanks,
> Dennis