Re: Compile error using contrib.utils.join package with new mapreduce API
Mahesh Balija 2013-01-13, 04:40
Hi Mike,
As far as I can see, DataJoinMapperBase and DataJoinReducerBase implement the Mapper and Reducer interfaces from the old mapred package (org.apache.hadoop.mapred). Because you are creating the job with the new API (org.apache.hadoop.mapreduce.Job), you get these compilation errors.

You should check whether DataJoinMapperBase and DataJoinReducerBase are available in the new API. Otherwise, you should rewrite your job in the old style using JobConf.

Best,
Mahesh Balija,
Calsoft Labs.

On Fri, Jan 11, 2013 at 9:00 PM, Michael Forage <[EMAIL PROTECTED]> wrote:

> Hi
>
> I’m using Hadoop 1.0.4 with the hadoop.mapreduce API and am having problems
> compiling a simple class that implements a reduce-side data join of 2 files.
>
> I’m trying to do this using contrib.utils.join, and in Eclipse it all
> compiles fine other than:
>
>     job.setMapperClass(MapClass.class);
>     job.setReducerClass(Reduce.class);
>
> …which both complain that the referenced class no longer extends either
> Mapper<> or Reducer<>
>
> It’s my understanding that they should instead extend DataJoinMapperBase
> and DataJoinReducerBase in order
>
> Have searched for a solution everywhere but, unfortunately, all the
> examples I can find are based on the deprecated mapred API.
>
> Assuming this package actually works with the new API, can anyone offer
> any advice?
>
> Complete compile errors:
>
> The method setMapperClass(Class<? extends Mapper>) in the type Job is not
> applicable for the arguments (Class<DataJoin.MapClass>)
>
> The method setReducerClass(Class<? extends Reducer>) in the type Job is
> not applicable for the arguments (Class<DataJoin.Reduce>)
>
> …and the code…
>
> package JoinTest;
>
> import java.io.DataInput;
> import java.io.DataOutput;
> import java.io.IOException;
> import java.util.Iterator;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.io.Writable;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.Mapper.Context;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> import org.apache.hadoop.contrib.utils.join.DataJoinMapperBase;
> import org.apache.hadoop.contrib.utils.join.DataJoinReducerBase;
> import org.apache.hadoop.contrib.utils.join.TaggedMapOutput;
>
> public class DataJoin extends Configured implements Tool {
>
>     public static class MapClass extends DataJoinMapperBase {
>
>         protected Text generateInputTag(String inputFile) {
>             String datasource = inputFile.split("-")[0];
>             return new Text(datasource);
>         }
>
>         protected Text generateGroupKey(TaggedMapOutput aRecord) {
>             String line = ((Text) aRecord.getData()).toString();
>             String[] tokens = line.split(",");
>             String groupKey = tokens[0];
>             return new Text(groupKey);
>         }
>
>         protected TaggedMapOutput generateTaggedMapOutput(Object value) {
>             TaggedWritable retv = new TaggedWritable((Text) value);