Re: Compile error using contrib.utils.join package with new mapreduce API
Hemanth Yamijala 2013-01-15, 17:29
On the dev mailing list, Harsh pointed out that there is another join-related package:
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/join/

This seems to be available in 2.x and trunk. Could you check whether this provides the functionality you require, so we at least know there is new API support in later versions?

Thanks
Hemanth

On Mon, Jan 14, 2013 at 7:45 PM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:

> Hi,
>
> No. I didn't find any reference to a working sample. I also didn't find
> any JIRA that asks for a migration of this package to the new API. Not sure
> why. I have asked on the dev list.
>
> Thanks
> hemanth
>
> On Mon, Jan 14, 2013 at 6:25 PM, Michael Forage <[EMAIL PROTECTED]> wrote:
>
>> Thanks Hemanth
>>
>> I appreciate your response
>>
>> Did you find any working example of it in use? It looks to me like I’d
>> still be tied to the old API
>>
>> Thanks
>>
>> Mike
>>
>> From: Hemanth Yamijala [mailto:[EMAIL PROTECTED]]
>> Sent: 14 January 2013 05:08
>> To: [EMAIL PROTECTED]
>> Subject: Re: Compile error using contrib.utils.join package with new mapreduce API
>>
>> Hi,
>>
>> The datajoin package has a class called DataJoinJob (
>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/contrib/utils/join/DataJoinJob.html
>> )
>>
>> I think using this will help you get around the issue you are facing.
>>
>> From the source, this is the command line usage of the class:
>>
>> usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts
>> mapper_class reducer_class map_output_value_class output_value_class
>> [maxNumOfValuesPerGroup [descriptionOfJob]]]
>>
>> Internally the class uses the old API to set the mapper and reducer
>> passed as arguments above.
>>
>> Thanks
>>
>> hemanth
>>
>> On Fri, Jan 11, 2013 at 9:00 PM, Michael Forage <[EMAIL PROTECTED]> wrote:
>>
>> Hi
>>
>> I’m using Hadoop 1.0.4 and using the hadoop.mapreduce API having problems
>> compiling a simple class to implement a reduce-side data join of 2 files.
>>
>> I’m trying to do this using contrib.utils.join and in Eclipse it all
>> compiles fine other than:
>>
>> job.setMapperClass(MapClass.class);
>> job.setReducerClass(Reduce.class);
>>
>> …which both complain that the referenced class no longer extends either
>> Mapper<> or Reducer<>
>>
>> It’s my understanding that for what they should instead extend DataJoinMapperBase
>> and DataJoinReducerBase in order
>>
>> Have searched for a solution everywhere but unfortunately, all the
>> examples I can find are based on the deprecated mapred API.
>>
>> Assuming this package actually works with the new API, can anyone offer
>> any advice?
>>
>> Complete compile errors:
>>
>> The method setMapperClass(Class<? extends Mapper>) in the type Job is not applicable for the arguments (Class<DataJoin.MapClass>)
>> The method setReducerClass(Class<? extends Reducer>) in the type Job is not applicable for the arguments (Class<DataJoin.Reduce>)
>>
>> …and the code…
>>
>> package JoinTest;
>>
>> import java.io.DataInput;
>> import java.io.DataOutput;
>> import java.io.IOException;
>> import java.util.Iterator;
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.conf.Configured;
>> import org.apache.hadoop.fs.Path;
>> import org.apache.hadoop.io.LongWritable;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.io.Writable;