MapReduce, mail # user - Maven dependency


Maven dependency
Kevin Burton 2013-04-24, 19:13
I am reading "Hadoop in Action", and on page 51 the author presents this
code:

 

public class WordCount2 {
    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(WordCount2.class);

        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);

        conf.setMapperClass(TokenCountMapper.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

This is an example of a simple MapReduce job, but as a beginner I am not
sure how to set up a project for this code. If I am using Maven, which Maven
dependencies do I need? There are several MapReduce dependencies and I am
not sure which to pick. Are other dependencies needed (for example, for
JobConf)? Which imports are needed? When the configuration is constructed,
what heuristics are used to find the configuration for the Hadoop cluster?

 

Thank you.
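
On the last question, a minimal sketch that may help frame it: as far as I understand, Hadoop's Configuration looks up core-default.xml and core-site.xml as classpath resources, and JobConf additionally registers mapred-default.xml and mapred-site.xml. The probe below uses only the JDK (no Hadoop classes) to show which of those files are visible on the current classpath. The class name ConfigResourceProbe and the helper locate are hypothetical names made up for illustration, not part of Hadoop.

```java
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: reports which of the default
// configuration resource names can be found on the classpath.
public class ConfigResourceProbe {

    // Resource names that Configuration and JobConf register as defaults
    // (assumption based on the Hadoop 1.x/2.x behaviour described above).
    static final String[] DEFAULT_RESOURCES = {
        "core-default.xml", "core-site.xml",
        "mapred-default.xml", "mapred-site.xml"
    };

    // Maps each resource name to its classpath URL, or null if it is absent.
    static Map<String, URL> locate(ClassLoader loader) {
        Map<String, URL> found = new LinkedHashMap<>();
        for (String name : DEFAULT_RESOURCES) {
            found.put(name, loader.getResource(name));
        }
        return found;
    }

    public static void main(String[] args) {
        // Prefer the thread context class loader, falling back to our own.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ConfigResourceProbe.class.getClassLoader();
        }
        for (Map.Entry<String, URL> e : locate(loader).entrySet()) {
            String where = (e.getValue() == null)
                    ? "not on classpath"
                    : e.getValue().toString();
            System.out.println(e.getKey() + " -> " + where);
        }
    }
}
```

Run with the same classpath as the job (for example, with the cluster's conf directory on it), this should list where each file is picked up. If the *-site.xml files are missing, the job would presumably fall back to the built-in defaults rather than the cluster's settings.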