Avro >> mail # user >> unsigned 32bit (uint) in Avro - C# ?


Re: unsigned 32bit (uint) in Avro - C# ?

On Feb 12, 2014, at 4:04 PM, Sid Shetye <[EMAIL PROTECTED]> wrote:

2. Unsigned 32/64-bit values have been extensively used as primitive types for over 3 decades (i.e. they've held their ground). Heck, even core Java devs hate that unsigned doesn't exist, e.g. http://stackoverflow.com/questions/430346/why-doesnt-java-support-unsigned-ints
3. All other workarounds simply add more friction to development when, in reality, working with a primitive data type that's been around "forever" should be very transparent and very fluid.

It does add some friction, but aren’t we in a space where the lowest common denominator has to be supported?
As you’re pointing out, #2 and #3 are about Java.

We’re not hitting an issue of serialization, afaik; if you’re looking for signed 32 bit, we’ve got that.
If you want unsigned, it seems to me that fixed is just fine for storage.
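
To make the storage point concrete, here is a minimal Python sketch of carrying an unsigned 32-bit value in a 4-byte fixed. The schema name "uint32", the helper names, and the big-endian byte order are all my own assumptions; Avro does not assign any meaning to the bytes of a fixed, so both sides would have to agree on this convention.

```python
import struct

# Hypothetical schema: a 4-byte fixed that both sides agree to read
# as an unsigned 32-bit integer. Avro itself only sees 4 opaque bytes.
UINT32_SCHEMA = {"type": "fixed", "size": 4, "name": "uint32"}


def encode_uint32(value: int) -> bytes:
    """Pack an unsigned 32-bit integer into 4 bytes (big-endian by assumption)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value out of range for uint32: %r" % (value,))
    return struct.pack(">I", value)


def decode_uint32(raw: bytes) -> int:
    """Unpack the 4 bytes of the fixed back into an unsigned integer."""
    if len(raw) != UINT32_SCHEMA["size"]:
        raise ValueError("expected %d bytes, got %d" % (UINT32_SCHEMA["size"], len(raw)))
    (value,) = struct.unpack(">I", raw)
    return value
```

The bytes returned by encode_uint32 are what would go into the fixed field; the full unsigned range survives the round trip, which a signed 32-bit int would not allow.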

Do we agree that serialization is not the issue?
The issue I’m seeing here is with actual schema expressivity as well as APIs in other languages.

That’s an area where I’m simply taking this section to heart:
“Attributes not defined in this document are permitted as metadata, but must not affect the format of serialized data.”

And that section to me screams out for a registry.
With a registry of attributes we could work around issues like this and still keep in sync with each other.

Heck, that “MD5” example in the manual is a great one:

{"type": "fixed", "size": 16, "name": "md5"}

We all know that means md5, but it’s just untyped in Avro. A registry for things like “md5”, “uint32”, etc., would be nice to have.
Then our silly selves can go ahead and implement more complex API/deserializers.
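
A rough Python sketch of what I have in mind, purely illustrative: the attribute name "x-registry-type", the REGISTRY table, and decode_fixed are hypothetical and not part of Avro or any Avro library. The extra attribute is plain metadata, so the serialized bytes of the fixed stay exactly the same; only the reader's interpretation changes.

```python
import json
import struct

# Hypothetical registry: maps an agreed-upon attribute value to a function
# that interprets the raw bytes of a fixed. Byte order is an assumption.
REGISTRY = {
    "uint32": lambda raw: struct.unpack(">I", raw)[0],
    "md5": lambda raw: raw.hex(),
}


def decode_fixed(schema_json: str, raw: bytes):
    """Interpret the bytes of a fixed using the schema's registry attribute.

    Falls back to returning the raw bytes when the attribute is missing
    or unknown, which is what a registry-unaware reader would see anyway.
    """
    schema = json.loads(schema_json)
    if schema.get("type") != "fixed":
        raise ValueError("not a fixed schema")
    if len(raw) != schema["size"]:
        raise ValueError("size mismatch")
    decoder = REGISTRY.get(schema.get("x-registry-type"))
    return decoder(raw) if decoder is not None else raw


# The extra attribute is metadata only; the fixed is still 4 opaque bytes on the wire.
COUNTER_SCHEMA = '{"type": "fixed", "size": 4, "name": "counter", "x-registry-type": "uint32"}'
counter = decode_fixed(COUNTER_SCHEMA, b"\x00\x00\x00\x2a")
```

A reader that knows the registry gets an integer back for the counter; one that doesn't still reads the same 4 bytes, so nobody falls out of sync.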
-Charles
