MapReduce, mail # user - Need help with exception when mapper emits different key class from reducer
Re: Need help with exception when mapper emits different key class from reducer
Steve Lewis 2010-06-19, 17:01
Wow - I cannot tell you how much I thank you. I totally missed the fact
that the exception is thrown in the combiner, since I was seeing the
exception in the reducer - I always thought the combiner was called between
the mapper and the reducer, not after the reducer.
Also, does this mean I should set no combiner at all, or use a very
generic combiner - especially for my real problem, where there is no real
combiner step?

On Fri, Jun 18, 2010 at 2:45 PM, Eric Sammer <[EMAIL PROTECTED]> wrote:

> This took me a full read-through to figure out. The problem is that
> you're using your reducer as a combiner, and when it runs, the output
> of the map stage becomes the wrong type.
>
> In pseudo-visual-speak:
>
> Object, Int -> Map() -> MyText, Int -> Combine() -> YourText, Int ->
> EXCEPTION!
>
> When using your reducer as a combiner, the reducer outputs *must*
> match the map outputs. In other words, your combiner - which is
> *optional* in the chain at Hadoop's pleasure - is changing the key
> space. That's a no-no. In your case, you can't reuse your reducer as a
> combiner.
>
> (The hint is in the exception: it's occurring in the combiner classes
> in Hadoop.)
>
> Hope that helps.
>
> On Fri, Jun 18, 2010 at 2:09 PM, Steve Lewis <[EMAIL PROTECTED]>
> wrote:
> >
> > This class is a copy of a standard WordCount class with one critical
> > exception: instead of the Mapper emitting a key of type Text, it emits a
> > key of type MyText - a simple subclass of Text.
> > The reducer emits a different subclass of Text - YourText.
> > I say
> >         job.setMapOutputKeyClass(MyText.class);
> >         job.setMapOutputValueClass(IntWritable.class);
> >         job.setOutputKeyClass(YourText.class);
> >         job.setOutputValueClass(IntWritable.class);
> > which should declare these classes directly, and yet I get the following
> > exception using Hadoop 0.20 on a local box.
> > What am I doing wrong?
> >
> > java.io.IOException: wrong key class: class
> > org.systemsbiology.hadoop.CapitalWordCount$YourText is not class
> > org.systemsbiology.hadoop.CapitalWordCount$MyText
> >     at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:164)
> >     at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
> >     at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1201)
> >     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> >     at org.systemsbiology.hadoop.CapitalWordCount$IntSumReducer.reduce(CapitalWordCount.java:89)
> >
> > package org.systemsbiology.hadoop;
> > import com.lordjoe.utilities.*;
> > import org.apache.hadoop.conf.*;
> > import org.apache.hadoop.fs.*;
> > import org.apache.hadoop.io.*;
> > import org.apache.hadoop.mapreduce.*;
> > import org.apache.hadoop.mapreduce.lib.input.*;
> > import org.apache.hadoop.mapreduce.lib.output.*;
> > import org.apache.hadoop.util.*;
> > import java.io.*;
> > import java.util.*;
> > /**
> >  *  org.systemsbiology.hadoop.CapitalWordCount
> >  */
> > public class CapitalWordCount {
> >     public static class YourText extends Text
> >       {
> >           public YourText() {
> >           }
> >           /**
> >            * Construct from a string.
> >            */
> >           public YourText(final String string) {
> >               super(string);
> >           }
> >       }
> >     public static class MyText extends Text
> >     {
> >         public MyText() {
> >         }
> >         /**
> >          * Construct from a string.
> >          */
> >         public MyText(final String string) {
> >             super(string);
> >         }
> >
> >     }
> >     public static class TokenizerMapper
> >             extends Mapper<Object, Text, MyText, IntWritable> {
> >         private final static IntWritable one = new IntWritable(1);
> >         private MyText word = new MyText();
> >         public void map(Object key, Text value, Context context
> >         ) throws IOException, InterruptedException {

Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA