Maven dependency
I am reading "Hadoop in Action" and the author on page 51 puts forth this
code:

 

public class WordCount2 {

    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(WordCount2.class);

        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(TokenCountMapper.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

This is an example of a simple MapReduce job, but being a beginner I am
not sure how to set up a project for this code. If I am using Maven, which
Maven dependencies do I need? There are several MapReduce-related artifacts
and I am not sure which to pick. Are other dependencies needed (for example,
for JobConf)? Which imports are required? And when the configuration is
constructed, what heuristics are used to find the configuration for the
Hadoop cluster?
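
For reference, my current guess at the pom.xml entry is below, assuming the
book's code targets the old org.apache.hadoop.mapred API and therefore the
hadoop-core 1.x artifact rather than the newer hadoop-client one, but I have
not been able to confirm this is right:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <!-- version is a guess; presumably whichever release matches the cluster -->
        <version>1.2.1</version>
    </dependency>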

 

Thank you.
