MapReduce >> mail # user >> Re: WordPairCount Mapreduce question.


Re: WordPairCount Mapreduce question.


Hello

I have a question about how MapReduce sorting works internally when the key has multiple columns.

My key class, shown further down, uses the 2 columns of the input file given below.
1st question: In the hashCode method we multiply word2's hash by 31 before adding it to word1's hash. Why is this required, and what does 31 refer to?
2nd question: If my input file had 3 columns instead of 2, how would you write the compare method? If anyone can map this to a real-world scenario, that would be really helpful.

    @Override
    public int compareTo(WordPairCountKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
        }
        return diff;
    }
   
    @Override
    public int hashCode() {
        return word1.hashCode() + 31 * word2.hashCode();
    }
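
A rough illustration for the first question. 31 is not specific to Hadoop; it is the small odd prime conventionally used in Java to combine field hashes (String.hashCode uses it as well). Multiplying one field's hash makes the result depend on field order, so (a, b) and (b, a) usually hash differently, whereas plain addition would always make them collide. The sketch assumes the job uses the default HashPartitioner, which computes (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The class name PairHashSketch and the reducer count of 4 are made up for the example.

```java
// Hypothetical helper class for illustration; not part of the original job.
class PairHashSketch {

    // Same formula as hashCode() in the question.
    static int weightedHash(String word1, String word2) {
        return word1.hashCode() + 31 * word2.hashCode();
    }

    // Plain addition, for comparison: symmetric, so swapped pairs always collide.
    static int plainSumHash(String word1, String word2) {
        return word1.hashCode() + word2.hashCode();
    }

    // Mirrors the arithmetic of Hadoop's default HashPartitioner.
    static int partitionFor(int hash, int numReduceTasks) {
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        System.out.println(weightedHash("a", "b"));  // 3135
        System.out.println(weightedHash("b", "a"));  // 3105
        System.out.println(plainSumHash("a", "b"));  // 195
        System.out.println(plainSumHash("b", "a"));  // 195

        // With 4 reducers (an assumed value), the two pairs land on different reducers.
        System.out.println(partitionFor(weightedHash("a", "b"), 4));  // 3
        System.out.println(partitionFor(weightedHash("b", "a"), 4));  // 1
    }
}
```

So hashCode decides which reducer receives a key, while compareTo decides the order of keys within each reducer.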

******************************

Here is my input file wordpair.txt

******************************

a    b
a    c
a    b
a    d
b    d
e    f
b    d
e    f
b    d
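
For this input, the order in which a single reducer would see the keys can be simulated without Hadoop. The sketch below is only an approximation under stated assumptions: it uses a TreeMap with the same word1-then-word2 comparison as the key class, and it models one reducer only (with several reducers, each one sees just its own partition, sorted the same way). The class names WordPairSortSketch and Pair are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone simulation; no Hadoop classes are used.
class WordPairSortSketch {

    // Minimal stand-in for the key class, with the same ordering rule.
    static final class Pair implements Comparable<Pair> {
        final String word1;
        final String word2;

        Pair(String word1, String word2) {
            this.word1 = word1;
            this.word2 = word2;
        }

        @Override
        public int compareTo(Pair o) {
            int diff = word1.compareTo(o.word1);
            if (diff == 0) {
                diff = word2.compareTo(o.word2);
            }
            return diff;
        }
    }

    // Counts each pair and returns "word1 word2 count" lines in key order.
    static List<String> sortedCounts(String[] lines) {
        Map<Pair, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            counts.merge(new Pair(cols[0], cols[1]), 1, Integer::sum);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<Pair, Integer> e : counts.entrySet()) {
            out.add(e.getKey().word1 + " " + e.getKey().word2 + " " + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] lines = {"a b", "a c", "a b", "a d", "b d", "e f", "b d", "e f", "b d"};
        // Prints: a b 2, a c 1, a d 1, b d 3, e f 2 (one per line)
        sortedCounts(lines).forEach(System.out::println);
    }
}
```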

**********************************
Here is my WordPairCountKey class:

*********************************

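import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class WordPairCountKey implements WritableComparable<WordPairCountKey> {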

    private String word1;
    private String word2;

    @Override
    public int compareTo(WordPairCountKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
        }
        return diff;
    }
   
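    // Used by the default HashPartitioner to choose a reducer for this key.
    @Override
    public int hashCode() {
        return word1.hashCode() + 31 * word2.hashCode();
    }

    // Keeps equals() consistent with hashCode() and compareTo().
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof WordPairCountKey)) {
            return false;
        }
        WordPairCountKey other = (WordPairCountKey) obj;
        return word1.equals(other.word1) && word2.equals(other.word2);
    }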

   
    public String getWord1() {
        return word1;
    }

    public void setWord1(String word1) {
        this.word1 = word1;
    }

    public String getWord2() {
        return word2;
    }

    public void setWord2(String word2) {
        this.word2 = word2;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        word1 = in.readUTF();
        word2 = in.readUTF();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(word1);
        out.writeUTF(word2);
    }

   
    @Override
    public String toString() {
        return "[word1=" + word1 + ", word2=" + word2 + "]";
    }

}
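
For the second question, a three-column key can follow the same pattern: compare the first column, fall through to the second on a tie, then to the third. The sketch below is one possible shape, not a tested Hadoop job. The scenario (counting records per country, state and city) and the class name LocationKeySketch are invented for illustration. To keep it runnable without Hadoop on the classpath, it implements plain Comparable; in a real job it would implement org.apache.hadoop.io.WritableComparable instead, with the same write and readFields methods.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical three-column key. In a real job this would implement
// WritableComparable<LocationKeySketch> rather than Comparable.
class LocationKeySketch implements Comparable<LocationKeySketch> {

    private String country = "";
    private String state = "";
    private String city = "";

    // Hadoop creates key objects reflectively, so a no-arg constructor is needed.
    LocationKeySketch() {
    }

    LocationKeySketch(String country, String state, String city) {
        this.country = country;
        this.state = state;
        this.city = city;
    }

    // Sort by country, then state, then city.
    @Override
    public int compareTo(LocationKeySketch o) {
        int diff = country.compareTo(o.country);
        if (diff == 0) {
            diff = state.compareTo(o.state);
        }
        if (diff == 0) {
            diff = city.compareTo(o.city);
        }
        return diff;
    }

    // Chain the 31 multiplier so every column's position affects the hash.
    @Override
    public int hashCode() {
        int h = country.hashCode();
        h = 31 * h + state.hashCode();
        h = 31 * h + city.hashCode();
        return h;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof LocationKeySketch)) {
            return false;
        }
        LocationKeySketch other = (LocationKeySketch) obj;
        return country.equals(other.country)
                && state.equals(other.state)
                && city.equals(other.city);
    }

    // Fields must be written and read back in the same order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(country);
        out.writeUTF(state);
        out.writeUTF(city);
    }

    public void readFields(DataInput in) throws IOException {
        country = in.readUTF();
        state = in.readUTF();
        city = in.readUTF();
    }

    @Override
    public String toString() {
        return "[country=" + country + ", state=" + state + ", city=" + city + "]";
    }
}
```

With this ordering, all keys for one country arrive together at a reducer, grouped by state and then by city within each state.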

******************************

Any help will be really appreciated.
Thanks
Sai