Avro >> mail # user >> Can serialized Avro records be efficiently compared without deserializing?


Re: Can serialized Avro records be efficiently compared without deserializing?
On Tue, May 22, 2012 at 1:22 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> Imagine I use Avro to serialize an object (without loss of generality let's
> say an array of longs). I'm curious if it is possible to compare those
> arrays without deserializing... ie look at the bytes in memory or on disk,
> and do the comparison based on those bytes (ie the raw comparison that
> Hadoop does in the shuffle sort).
>
> I poked around the documentation but wasn't sure where to look.

Yes, this is possible.

The Java method that does this is BinaryData#compare().

http://avro.apache.org/docs/current/api/java/org/apache/avro/io/BinaryData.html#compare(byte[], int, byte[], int, org.apache.avro.Schema)

Doug
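
To make the idea concrete for the array-of-longs case in the question, here is a minimal, stdlib-only sketch of comparing two encoded arrays straight from their bytes. It does not call Avro and is not Avro's implementation; real code should use BinaryData.compare() with the schema. The class RawLongArrayCompare and all of its helpers are hypothetical names made up for illustration. The sketch assumes the binary encoding described in the Avro specification: longs are zig-zag variable-length integers, and an array is a series of blocks (an item count followed by the items), terminated by a zero count, where a negative count is followed by the block's size in bytes. Arrays are ordered element by element, with a shorter prefix sorting first.

```java
import java.io.ByteArrayOutputStream;

public class RawLongArrayCompare {

    /** Read position over a byte array, advanced as values are decoded. */
    static final class Cursor {
        final byte[] buf;
        int pos;
        Cursor(byte[] buf, int pos) { this.buf = buf; this.pos = pos; }
    }

    /** Writes one long as a zig-zag variable-length integer. */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);               // zig-zag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));    // 7 payload bits + continuation bit
            z >>>= 7;
        }
        out.write((int) z);
    }

    /** Encodes an array as one block (count, items) plus the zero terminator. */
    static byte[] encode(long... items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (items.length > 0) {
            writeLong(out, items.length);
            for (long v : items) writeLong(out, v);
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    /** Decodes one zig-zag variable-length long at the cursor. */
    static long readLong(Cursor c) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = c.buf[c.pos++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);                 // undo zig-zag
    }

    /** Reads the next block's item count; 0 means the array has ended. */
    static long readBlockCount(Cursor c) {
        long count = readLong(c);
        if (count < 0) {       // a negative count is followed by the block size in bytes
            readLong(c);       // the size is not needed here, so skip it
            count = -count;
        }
        return count;
    }

    /** Compares two encoded long arrays without building either array. */
    static int compare(byte[] b1, int s1, byte[] b2, int s2) {
        Cursor c1 = new Cursor(b1, s1);
        Cursor c2 = new Cursor(b2, s2);
        long left1 = 0, left2 = 0;                   // items left in the current blocks
        while (true) {
            if (left1 == 0) left1 = readBlockCount(c1);
            if (left2 == 0) left2 = readBlockCount(c2);
            if (left1 == 0 || left2 == 0) {
                return Long.compare(left1, left2);   // the array that ended first sorts first
            }
            int c = Long.compare(readLong(c1), readLong(c2));
            if (c != 0) return c;
            left1--;
            left2--;
        }
    }

    public static void main(String[] args) {
        byte[] a = encode(1, 2, 3);
        byte[] b = encode(1, 2, 4);
        System.out.println(Integer.signum(compare(a, 0, b, 0)));   // prints -1
    }
}
```

Note that a plain byte-wise comparison (memcmp style) would not give the right order here: with zig-zag encoding, -1 is written as 0x01 and 0 as 0x00, so the raw bytes would sort 0 before -1. The comparison therefore has to decode each value as it goes, but it never has to materialize the arrays.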