Re: WordPairCount Mapreduce question.
Mahesh Balija 2013-02-23, 13:23
Please check the in-line answers...
On Sat, Feb 23, 2013 at 6:22 PM, Sai Sai <[EMAIL PROTECTED]> wrote:

> Hello
>
> I have a question about how Mapreduce sorting works internally with
> multiple columns.
>
> Below r my classes using 2 columns in an input file given below.
>
> 1st question: About the method hashCode, we r adding a "31 + ", i am
> wondering why is this required. what does 31 refer to.

This is how the hash code of any String instance is usually calculated:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], where n is the length of the
String. 31 is simply the multiplier in that formula. Your key combines its
two fields in the same way: with a = word1.hashCode() and
b = word2.hashCode(), the result is a * 31^0 + b * 31^1.

> 2nd question: what if my input file has 3 columns instead of 2 how would
> you write a compare method and was wondering if anyone can map this to a
> real world scenario it will be really helpful.

You extend the same approach to the third column: compare word3 only when
word1 and word2 are both equal.

    public int compareTo(WordPairCountKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
            if (diff == 0) {
                diff = word3.compareTo(o.word3);
            }
        }
        return diff;
    }

> @Override
> public int compareTo(WordPairCountKey o) {
>     int diff = word1.compareTo(o.word1);
>     if (diff == 0) {
>         diff = word2.compareTo(o.word2);
>     }
>     return diff;
> }
>
> @Override
> public int hashCode() {
>     return word1.hashCode() + 31 * word2.hashCode();
> }
>
> ******************************
>
> Here is my input file wordpair.txt
>
> ******************************
>
> a b
> a c
> a b
> a d
> b d
> e f
> b d
> e f
> b d
>
> **********************************
>
> Here is my WordPairObject:
>
> *********************************
>
> public class WordPairCountKey implements
>         WritableComparable<WordPairCountKey> {
>
>     private String word1;
>     private String word2;
>
>     @Override
>     public int compareTo(WordPairCountKey o) {
>         int diff = word1.compareTo(o.word1);
>         if (diff == 0) {
>             diff = word2.compareTo(o.word2);
>         }
>         return diff;
>     }
>
>     @Override
>     public int hashCode() {
>         return word1.hashCode() + 31 * word2.hashCode();
>     }
>
>     public String getWord1() {
>         return word1;
>     }
>
>     public void setWord1(String word1) {
>         this.word1 = word1;
>     }
>
>     public String getWord2() {
>         return word2;
>     }
>
>     public void setWord2(String word2) {
>         this.word2 = word2;
>     }
>
>     @Override
>     public void readFields(DataInput in) throws IOException {
>         word1 = in.readUTF();
>         word2 = in.readUTF();
>     }
>
>     @Override
>     public void write(DataOutput out) throws IOException {
>         out.writeUTF(word1);
>         out.writeUTF(word2);
>     }
>
>     @Override
>     public String toString() {
>         return "[word1=" + word1 + ", word2=" + word2 + "]";
>     }
>
> }
>
> ******************************
>
> Any help will be really appreciated.
> Thanks
> Sai
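
For reference, here is a minimal sketch of what a three-column key might look like, putting the compareTo and hashCode points from this thread together. The class name WordTripleKey, the field word3, and the helper partitionFor are hypothetical and not from the original post. The class implements plain Comparable so it compiles without Hadoop on the classpath; in a real job it would implement WritableComparable<WordTripleKey> instead. The hashCode uses the conventional "31 * result + next" accumulation rather than the exact expression in the original class, and an equals method is added so that it stays consistent with hashCode and compareTo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical three-column key (not from the original post). In a real job
// this would implement org.apache.hadoop.io.WritableComparable<WordTripleKey>.
final class WordTripleKey implements Comparable<WordTripleKey> {
    private String word1 = "";
    private String word2 = "";
    private String word3 = "";

    // No-arg constructor so an empty instance can be filled by readFields.
    public WordTripleKey() { }

    public WordTripleKey(String word1, String word2, String word3) {
        this.word1 = word1;
        this.word2 = word2;
        this.word3 = word3;
    }

    // Sort by word1, break ties on word2, then on word3.
    @Override
    public int compareTo(WordTripleKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
        }
        if (diff == 0) {
            diff = word3.compareTo(o.word3);
        }
        return diff;
    }

    // Keys that compare as equal should also be equal and hash alike.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof WordTripleKey)) {
            return false;
        }
        WordTripleKey o = (WordTripleKey) obj;
        return word1.equals(o.word1) && word2.equals(o.word2) && word3.equals(o.word3);
    }

    // Conventional accumulation with 31 as the multiplier.
    @Override
    public int hashCode() {
        int result = word1.hashCode();
        result = 31 * result + word2.hashCode();
        result = 31 * result + word3.hashCode();
        return result;
    }

    // Same signatures as the Writable methods: fields written in a fixed order...
    public void write(DataOutput out) throws IOException {
        out.writeUTF(word1);
        out.writeUTF(word2);
        out.writeUTF(word3);
    }

    // ...and read back in exactly that order.
    public void readFields(DataInput in) throws IOException {
        word1 = in.readUTF();
        word2 = in.readUTF();
        word3 = in.readUTF();
    }

    @Override
    public String toString() {
        return "[word1=" + word1 + ", word2=" + word2 + ", word3=" + word3 + "]";
    }

    // Hypothetical helper using the same arithmetic as Hadoop's default
    // HashPartitioner: clear the sign bit, then take the remainder.
    public static int partitionFor(WordTripleKey key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) throws IOException {
        WordTripleKey original = new WordTripleKey("a", "b", "c");

        // Round-trip through write/readFields, as happens between map and reduce.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));
        WordTripleKey copy = new WordTripleKey();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));

        System.out.println(copy + " equals original: " + copy.equals(original));
        System.out.println("compare with (a, b, d): "
                + original.compareTo(new WordTripleKey("a", "b", "d")));
        System.out.println("partition with 4 reducers: " + partitionFor(original, 4));
    }
}
```

The hash code matters because the default partitioner uses it to decide which reducer receives a key, so two keys that compare as equal need the same hash code to end up in the same reduce call. As for a real-world mapping, one possible three-column case would be counting (country, city, product) triples in sales records, where the sort order groups all records for a country together, then by city, then by product.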