Re: Compile error using contrib.utils.join package with new mapreduce API
Hemanth Yamijala 2013-01-14, 05:07
Hi,

The datajoin package has a class called DataJoinJob (
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/contrib/utils/join/DataJoinJob.html
)

I think using this will help you get around the issue you are facing.

From the source, this is the command-line usage of the class:

usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts
mapper_class reducer_class map_output_value_class output_value_class
[maxNumOfValuesPerGroup [descriptionOfJob]]

Internally the class uses the old API to set the mapper and reducer passed
as arguments above.
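That usage line maps onto a `hadoop jar` invocation along these lines (a sketch only: the jar path, input/output directories, and the mapper/reducer class names are illustrative assumptions, not taken from this thread):

```shell
# Hypothetical DataJoinJob invocation (Hadoop 1.0.4 contrib layout assumed);
# every path and class name below is an example, not from the thread.
hadoop jar $HADOOP_HOME/contrib/datajoin/hadoop-datajoin-1.0.4.jar \
  org.apache.hadoop.contrib.utils.join.DataJoinJob \
  /user/me/input /user/me/output \
  org.apache.hadoop.mapred.TextInputFormat 1 \
  JoinTest.MyMapper JoinTest.MyReducer \
  org.apache.hadoop.io.Text org.apache.hadoop.io.Text
```

Note that the mapper and reducer classes passed here would need to extend DataJoinMapperBase/DataJoinReducerBase (the old-API base classes), since the job is configured internally through the old API as described above.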

Thanks
hemanth
On Fri, Jan 11, 2013 at 9:00 PM, Michael Forage <
[EMAIL PROTECTED]> wrote:

>  Hi
>
> I’m using Hadoop 1.0.4 with the hadoop.mapreduce API, and I’m having
> problems compiling a simple class to implement a reduce-side data join
> of 2 files.
>
> I’m trying to do this using contrib.utils.join, and in Eclipse it all
> compiles fine other than:
>
>
> job.setMapperClass(MapClass.class);
>
> job.setReducerClass(Reduce.class);
>
>
> …which both complain that the referenced class no longer extends either
> Mapper<> or Reducer<>.
>
> It’s my understanding that they should instead extend DataJoinMapperBase
> and DataJoinReducerBase.
>
> I’ve searched for a solution everywhere but, unfortunately, all the
> examples I can find are based on the deprecated mapred API.
>
> Assuming this package actually works with the new API, can anyone offer
> any advice?
>
> Complete compile errors:
>
> The method setMapperClass(Class<? extends Mapper>) in the type Job is not
> applicable for the arguments (Class<DataJoin.MapClass>)
>
> The method setReducerClass(Class<? extends Reducer>) in the type Job is
> not applicable for the arguments (Class<DataJoin.Reduce>)
>
> …and the code…
>
>
> package JoinTest;
>
> import java.io.DataInput;
> import java.io.DataOutput;
> import java.io.IOException;
> import java.util.Iterator;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.io.Writable;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.Mapper.Context;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> import org.apache.hadoop.contrib.utils.join.DataJoinMapperBase;
> import org.apache.hadoop.contrib.utils.join.DataJoinReducerBase;
> import org.apache.hadoop.contrib.utils.join.TaggedMapOutput;
>
> public class DataJoin extends Configured implements Tool {
>
>     public static class MapClass extends DataJoinMapperBase {
>
>         protected Text generateInputTag(String inputFile) {
>             String datasource = inputFile.split("-")[0];
>             return new Text(datasource);
>         }
>
>         protected Text generateGroupKey(TaggedMapOutput aRecord) {
>             String line = ((Text) aRecord.getData()).toString();
>             String[] tokens = line.split(",");
>             String groupKey = tokens[0];
>             return new Text(groupKey);
>         }
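The tag and group-key logic in the quoted MapClass can be checked in isolation. Here is a minimal stdlib-only sketch in which String stands in for org.apache.hadoop.io.Text, so no Hadoop classes are needed on the classpath (the class name and sample inputs are illustrative):

```java
// Stdlib-only sketch of the tag/group-key derivation from the quoted
// MapClass; String stands in for org.apache.hadoop.io.Text so the logic
// runs without Hadoop on the classpath.
public class TagLogicSketch {

    // Mirrors generateInputTag: the data-source tag is the part of the
    // input file name before the first '-'.
    static String generateInputTag(String inputFile) {
        return inputFile.split("-")[0];
    }

    // Mirrors generateGroupKey: the join key is the first comma-separated
    // field of the record.
    static String generateGroupKey(String line) {
        return line.split(",")[0];
    }

    public static void main(String[] args) {
        System.out.println(generateInputTag("customers-part-00000")); // customers
        System.out.println(generateGroupKey("42,Alice,London"));      // 42
    }
}
```

With file names like customers-part-00000 and orders-part-00000, records from both inputs that share the first field end up in the same reduce group, tagged by source, which is the premise of the reduce-side join.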
Hemanth Yamijala 2013-01-14, 14:15