-
Problem with Reducer emitting a different Key than Mapper
Steve Lewis 2010-06-16, 16:15
I have the following code where the Mapper emits a custom key and the Reducer is expected to emit Text.
Using Hadoop 0.2 on a local instance, I ask the reducer to write Text,Text (this is even what the IDE says I should do), and what I get is the exception below. Any bright ideas?
public class PartitionReducer extends Reducer<GenonePartitionKey, Text, Text, Text> {

    /**
     * This method is called once for each key. Most applications will define
     * their reduce class by overriding this method. The default implementation
     * is an identity function.
     */
    @Override
    protected void reduce(GenonePartitionKey key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        context.write(new Text("Foo"), new Text("Bar"));
        ....
    }
}

10/06/16 09:08:45 INFO mapred.MapTask: Starting flush of map output
10/06/16 09:08:45 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1123)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.io.IOException: wrong key class: class org.apache.hadoop.io.Text is not class org.systemsbiology.hadoop.GenonePartitionKey
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:164)
    at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1201)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.systemsbiology.hadoop.PartitionReducer.reduce(PartitionReducer.java:165)
    at org.systemsbiology.hadoop.PartitionReducer.reduce(PartitionReducer.java:23)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.systemsbiology.hadoop.PartitionReducer.run(PartitionReducer.java:259)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1222)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)

--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA
-
Re: Problem with Reducer emitting a different Key than Mapper
Alex Kozlov 2010-06-16, 16:26
Hi Steve, did you set the job's output classes?

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
Alex K
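
For readers following along, here is a rough, Hadoop-free sketch of why the declared key class matters. It assumes a simplified model: a writer that is constructed with one key class and rejects anything else, which is the kind of check that produces the "wrong key class" message in the trace. The names KeyClassCheckDemo, PartitionKey, CheckedWriter and tryAppend are hypothetical and exist only for this illustration, and String stands in for Text. In a real job the declared classes come from the driver, via Job.setMapOutputKeyClass for the map side and Job.setOutputKeyClass for the final output.

```java
import java.io.IOException;

public class KeyClassCheckDemo {

    /** Hypothetical stand-in for the custom map output key in the thread. */
    static final class PartitionKey {
        final String id;
        PartitionKey(String id) { this.id = id; }
    }

    /** Simplified model (an assumption, not Hadoop code): accepts one declared key class only. */
    static final class CheckedWriter {
        private final Class<?> keyClass;
        private int records = 0;

        CheckedWriter(Class<?> keyClass) { this.keyClass = keyClass; }

        void append(Object key, Object value) throws IOException {
            // Exact class comparison: a key of any other class is rejected.
            if (key.getClass() != keyClass) {
                throw new IOException(
                        "wrong key class: " + key.getClass() + " is not " + keyClass);
            }
            records++;
        }

        int records() { return records; }
    }

    /** Returns null on success, or the error message if the writer rejected the key. */
    static String tryAppend(CheckedWriter writer, Object key, Object value) {
        try {
            writer.append(key, value);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Map-side writer, declared with the map output key class.
        CheckedWriter spill = new CheckedWriter(PartitionKey.class);
        // Final-output writer, declared with the job output key class.
        CheckedWriter output = new CheckedWriter(String.class);

        // Key matches the declared class: accepted.
        System.out.println(tryAppend(spill, new PartitionKey("p1"), "Bar"));
        // Text-like key written where the custom key is declared: rejected.
        System.out.println(tryAppend(spill, "Foo", "Bar"));
        // Same key written where the text-like class is declared: accepted.
        System.out.println(tryAppend(output, "Foo", "Bar"));
    }
}
```

The point of the sketch is only that each writer is checked against the class it was declared with, so the map output classes and the job output classes need to be declared separately when they differ. The trace in the original message also passes through NewCombinerRunner, which suggests the reducer was running as a combiner at that point; a combiner writes back into the map-side output, so its keys are checked against the map output key class.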
-
Re: Problem with Reducer emitting a different Key than Mapper
Steve Lewis 2010-06-16, 20:27
Yes - that was the problem - thanks