Jane Wayne 2012-03-20, 06:47
Chris White 2012-03-20, 10:30
Re: is implementing WritableComparable and setting Job.setSortComparatorClass(...) redundant?
Jane Wayne 2012-03-20, 15:57
thanks chris!
On Tue, Mar 20, 2012 at 6:30 AM, Chris White <[EMAIL PROTECTED]> wrote:
> Setting sortComparatorClass will allow you to configure a
> RawComparator implementation (allowing you to do more efficient
> comparisons at the byte level). If you don't set it then hadoop uses
> the WritableComparator by default. This implementation deserializes
> the bytes into instances using your readFields method and then calls
> compareTo to determine key ordering. (look at the source in
> org.apache.hadoop.io.WritableComparator.compare(byte[], int, int,
> byte[], int, int))
>
> So if you don't want to be as efficient as possible, then delegating
> to WritableComparator is probably fine.
>
> Note that you can also configure a RawComparator for your key class
> using a static block to register it with WritableComparator, look at
> the source for Text for an example of this:
>
>   /** A WritableComparator optimized for Text keys. */
>   public static class Comparator extends WritableComparator {
>     public Comparator() {
>       super(Text.class);
>     }
>
>     public int compare(byte[] b1, int s1, int l1,
>                        byte[] b2, int s2, int l2) {
>       int n1 = WritableUtils.decodeVIntSize(b1[s1]);
>       int n2 = WritableUtils.decodeVIntSize(b2[s2]);
>       return compareBytes(b1, s1+n1, l1-n1, b2, s2+n2, l2-n2);
>     }
>   }
>
>   static {
>     // register this comparator
>     WritableComparator.define(Text.class, new Comparator());
>   }
>
> Chris
>
> On Tue, Mar 20, 2012 at 2:47 AM, Jane Wayne <[EMAIL PROTECTED]>
> wrote:
> > quick question:
> >
> > i have a key that already implements WritableComparable. this will be the
> > intermediary key passed from the map to the reducer.
> >
> > is it necessary to extend RawComparator and set it on
> > Job.setSortComparatorClass(Class<? extends RawComparator> cls) ?
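
The difference between the two comparison paths described in the quoted reply can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use any Hadoop classes, the class name RawCompareSketch and its methods are hypothetical, and the key is assumed to be a single int serialized as 4 big-endian bytes (the layout DataOutput.writeInt produces). One method rebuilds both keys before comparing them, roughly what the default comparator is described as doing; the other orders the keys by reading the serialized bytes directly.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of Hadoop.
public class RawCompareSketch {

    /** Serialize an int key as 4 big-endian bytes (same layout as DataOutput.writeInt). */
    static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    /** Deserializing path: rebuild both keys as objects, then call compareTo. */
    static int compareByDeserializing(byte[] b1, int s1, int l1,
                                      byte[] b2, int s2, int l2) {
        Integer k1 = ByteBuffer.wrap(b1, s1, l1).getInt();
        Integer k2 = ByteBuffer.wrap(b2, s2, l2).getInt();
        return k1.compareTo(k2);
    }

    /** Raw path: order the keys from their bytes without creating key objects. */
    static int compareRaw(byte[] b1, int s1, int l1,
                          byte[] b2, int s2, int l2) {
        // The first byte carries the sign bit, so it is compared as a signed value.
        int c = Byte.compare(b1[s1], b2[s2]);
        if (c != 0) {
            return c;
        }
        // The remaining three bytes are compared as unsigned values.
        for (int i = 1; i < 4; i++) {
            c = Integer.compare(b1[s1 + i] & 0xff, b2[s2 + i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(3);
        // Both paths should agree on the sign of the result.
        System.out.println(Integer.signum(compareByDeserializing(a, 0, 4, b, 0, 4)));
        System.out.println(Integer.signum(compareRaw(a, 0, 4, b, 0, 4)));
    }
}
```

Both methods take the same (bytes, start, length) arguments as the compare method in the quoted Text example, so the only thing that changes between them is whether key objects are created during the sort.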