Re: Compile error using contrib.utils.join package with new mapreduce API
Hi,

The datajoin package has a class called DataJoinJob (
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/contrib/utils/join/DataJoinJob.html
)

I think using this will help you get around the issue you are facing.

From the source, this is the command line usage of the class:

usage: DataJoinJob inputdirs outputdir map_input_file_format  numofParts
mapper_class reducer_class map_output_value_class output_value_class
[maxNumOfValuesPerGroup [descriptionOfJob]]]

Internally the class uses the old API to set the mapper and reducer passed
as arguments above.
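
For illustration, here is a minimal, untested sketch of a driver that wires
such classes up through the old org.apache.hadoop.mapred API directly,
assuming the MapClass and Reduce classes from the code quoted below
(TaggedWritable stands in for whatever TaggedMapOutput subclass the job
actually uses):

package JoinTest;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class DataJoinDriver {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(DataJoinDriver.class);
        job.setJobName("reduce-side join");

        // Input and output paths come from the command line in this sketch.
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setInputFormat(TextInputFormat.class);
        job.setOutputFormat(TextOutputFormat.class);

        // The old-API setters accept these classes because DataJoinMapperBase
        // and DataJoinReducerBase implement org.apache.hadoop.mapred.Mapper
        // and Reducer.
        job.setMapperClass(DataJoin.MapClass.class);
        job.setReducerClass(DataJoin.Reduce.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        // The map output value class must be the job's TaggedMapOutput
        // implementation, e.g.:
        // job.setMapOutputValueClass(TaggedWritable.class);  // hypothetical name

        JobClient.runJob(job);
    }
}

The underlying point is that org.apache.hadoop.mapreduce.Job.setMapperClass
expects a subclass of the new mapreduce.Mapper class, which the DataJoin*
base classes do not extend, whereas JobConf.setMapperClass expects the old
mapred.Mapper interface, which they do implement; that mismatch is what
produces the compile errors quoted below.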

Thanks
hemanth
On Fri, Jan 11, 2013 at 9:00 PM, Michael Forage <
[EMAIL PROTECTED]> wrote:

>  Hi
>
> I’m using Hadoop 1.0.4 with the hadoop.mapreduce API, and I’m having problems
> compiling a simple class to implement a reduce-side data join of 2 files.
>
> I’m trying to do this using contrib.utils.join, and in Eclipse it all
> compiles fine other than:
>
>       job.setMapperClass(MapClass.class);
>       job.setReducerClass(Reduce.class);
>
> …which both complain that the referenced class no longer extends either
> Mapper<> or Reducer<>.
>
> It’s my understanding that they should instead extend DataJoinMapperBase
> and DataJoinReducerBase in order for the join to work.
>
> I have searched for a solution everywhere but, unfortunately, all the
> examples I can find are based on the deprecated mapred API.
>
> Assuming this package actually works with the new API, can anyone offer
> any advice?
>
> Complete compile errors:
>
> The method setMapperClass(Class<? extends Mapper>) in the type Job is not
> applicable for the arguments (Class<DataJoin.MapClass>)
>
> The method setReducerClass(Class<? extends Reducer>) in the type Job is
> not applicable for the arguments (Class<DataJoin.Reduce>)
>
> …and the code…
>
> package JoinTest;
>
> import java.io.DataInput;
> import java.io.DataOutput;
> import java.io.IOException;
> import java.util.Iterator;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.io.Writable;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.Mapper.Context;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> import org.apache.hadoop.contrib.utils.join.DataJoinMapperBase;
> import org.apache.hadoop.contrib.utils.join.DataJoinReducerBase;
> import org.apache.hadoop.contrib.utils.join.TaggedMapOutput;
>
> public class DataJoin extends Configured implements Tool {
>
>     public static class MapClass extends DataJoinMapperBase {
>
>         protected Text generateInputTag(String inputFile) {
>             String datasource = inputFile.split("-")[0];
>             return new Text(datasource);
>         }
>
>         protected Text generateGroupKey(TaggedMapOutput aRecord) {
>             String line = ((Text) aRecord.getData()).toString();
>             String[] tokens = line.split(",");
>             String groupKey = tokens[0];
>             return new Text(groupKey);
>         }
Hemanth Yamijala 2013-01-14, 14:15