Jonathan Coveney 2012-05-22, 20:22
Russell Jurney 2012-05-22, 21:43
Re: Can serialized Avro records be efficiently compared without deserializing?
On Tue, May 22, 2012 at 1:22 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> Imagine I use Avro to serialize an object (without loss of generality let's
> say an array of longs). I'm curious if it is possible to compare those
> arrays without deserializing, i.e. look at the bytes in memory or on disk
> and do the comparison based on those bytes (i.e. the raw comparison that
> Hadoop does in the shuffle sort).
> I poked around the documentation but wasn't sure where to look.
Yes, this is possible.
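The Java method that does this is
BinaryData#compare(byte[], int, byte[], int, org.apache.avro.Schema).
It compares two encoded values in place, starting at the given offsets in
each byte array, using the sort order defined by the schema.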
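
To make the idea concrete for the array-of-longs case in the question, here is a minimal stdlib-only sketch of comparing two encoded arrays by walking their bytes, without building any arrays or boxed objects. This is not Avro's code. The class and method names (RawLongArrayCompare, ArrayItems, encode) are made up for illustration, and the encoder is only there so the sketch runs on its own. It assumes the standard Avro binary layout: longs as zigzag varints, arrays as counted blocks terminated by a zero count. In real code you would call BinaryData.compare with your schema instead.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration, not part of Avro.
public class RawLongArrayCompare {

    // Walks the items of one encoded array directly in its byte buffer.
    static final class ArrayItems {
        final byte[] buf;
        int pos;
        long remaining = 0;   // items left in the current block
        boolean done = false; // true once the zero-count terminator is read

        ArrayItems(byte[] buf, int start) { this.buf = buf; this.pos = start; }

        // Reads one long: variable-length, zigzag-encoded.
        long readLong() {
            long raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;
                raw |= (long) (b & 0x7f) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return (raw >>> 1) ^ -(raw & 1); // undo zigzag
        }

        boolean hasNext() {
            if (!done && remaining == 0) {
                long count = readLong();
                if (count == 0) {
                    done = true;          // end of array
                } else if (count < 0) {
                    readLong();           // skip the block's byte size
                    remaining = -count;
                } else {
                    remaining = count;
                }
            }
            return !done;
        }

        long next() { remaining--; return readLong(); }
    }

    // Element-by-element comparison; a shorter array that is a prefix sorts first.
    static int compare(byte[] b1, int s1, byte[] b2, int s2) {
        ArrayItems a = new ArrayItems(b1, s1);
        ArrayItems b = new ArrayItems(b2, s2);
        while (true) {
            boolean ha = a.hasNext();
            boolean hb = b.hasNext();
            if (!ha || !hb) return Boolean.compare(ha, hb);
            int c = Long.compare(a.next(), b.next());
            if (c != 0) return c;
        }
    }

    // Demo-only encoder: one block holding all items, then the terminator.
    static byte[] encode(long... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (values.length > 0) {
            writeLong(out, values.length);
            for (long v : values) writeLong(out, v);
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    static void writeLong(ByteArrayOutputStream out, long v) {
        long raw = (v << 1) ^ (v >> 63); // zigzag
        while ((raw & ~0x7fL) != 0) {
            out.write((int) ((raw & 0x7f) | 0x80));
            raw >>>= 7;
        }
        out.write((int) raw);
    }

    public static void main(String[] args) {
        System.out.println(Integer.signum(compare(encode(1, 2, 3), 0, encode(1, 2, 4), 0)));
        System.out.println(Integer.signum(compare(encode(-5), 0, encode(3), 0)));
    }
}
```

Note that the comparison still decodes each varint as it goes. A plain lexicographic comparison of the bytes would not give numeric order, because the zigzag encoding maps -5 to a larger byte value than 3. That is why the comparison needs to know the schema rather than treating the data as an opaque byte string.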
Jon Coveney 2012-05-24, 06:13