Arko Provo Mukherjee 2012-04-17, 05:02
Dear All,
I am porting code from the old API to the new API (Context objects) and run on Hadoop 0.20.203.
Job job_first = new Job();

job_first.setJarByClass(My.class);
job_first.setNumReduceTasks(no_of_reduce_tasks);
job_first.setJobName("My_Job");

FileInputFormat.addInputPath( job_first, new Path (Input_Path) );
FileOutputFormat.setOutputPath( job_first, new Path (Output_Path) );

job_first.setMapperClass(Map_First.class);
job_first.setReducerClass(Reduce_First.class);

job_first.setMapOutputKeyClass(IntWritable.class);
job_first.setMapOutputValueClass(Text.class);

job_first.setOutputKeyClass(NullWritable.class);
job_first.setOutputValueClass(Text.class);

job_first.waitForCompletion(true);
The problem I am facing is that, instead of emitting values to the reducers, the mappers are writing their output directly to the output path, and the reducers are not processing anything.
Following the online materials I could find, both my map and reduce methods use context.write to emit their values.
Please help. Thanks a lot in advance!!
Warm regards Arko
Devaraj k 2012-04-17, 05:48
Hi Arko,
What is the value of 'no_of_reduce_tasks'?
If the number of reduce tasks is 0, the map tasks write their output directly to the job output path.
Thanks Devaraj
________________________________________ From: Arko Provo Mukherjee [[EMAIL PROTECTED]] Sent: Tuesday, April 17, 2012 10:32 AM To: [EMAIL PROTECTED] Subject: Reducer not firing
Arko Provo Mukherjee 2012-04-17, 08:37
Hello,
Many thanks for the reply.
The 'no_of_reduce_tasks' is set to 2. I have a print statement before the code I pasted below to check that.
Also, I can find two output files, part-r-00000 and part-r-00001. But they contain the values that were output by the Mapper logic.
Please let me know what I can check further.
Thanks a lot in advance!
Warm regards Arko
Devaraj k 2012-04-17, 09:30
Can you check the task attempt logs in your cluster and find out what is happening in the reduce phase? By default, the task attempt logs are under $HADOOP_LOG_DIR/userlogs/<job-id>/. There could be a bug in your reducer that leads to this output.
Thanks Devaraj
kasi subrahmanyam 2012-04-17, 13:40
Could you comment out the line where you set the number of reduce tasks and see how the program behaves? If you have already tried that, could you share the output?
Bejoy KS 2012-04-17, 14:03
Hi Arko,
From the naming of the output files, your job does have a reduce phase. But the reducer being run is the IdentityReducer rather than your custom reducer, which is why you see the map output unchanged in the output files. You need to go through your code and logs to see why the IdentityReducer is being triggered.
Regards Bejoy KS
Sent from handheld, please excuse typos.
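The file-naming clue used in the reply above can be sketched in a few lines of Java. With the new-API FileOutputFormat, a map-only job writes part-m-NNNNN files, while reduce tasks write part-r-NNNNN files, so the part-r-00000 and part-r-00001 files reported earlier mean that reduce tasks did run. The class PartFileKind and its describe method are hypothetical names used only for illustration; they are not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): reads the phase and task index
// out of a new-API output file name such as "part-r-00001".
class PartFileKind {
    // "m" marks a file written by a map task, "r" one written by a reduce task.
    private static final Pattern PART = Pattern.compile("part-([mr])-(\\d+)");

    static String describe(String fileName) {
        Matcher m = PART.matcher(fileName);
        if (!m.matches()) {
            return "not a part file";
        }
        String phase = m.group(1).equals("m") ? "map" : "reduce";
        int taskIndex = Integer.parseInt(m.group(2));
        return phase + " task " + taskIndex;
    }

    public static void main(String[] args) {
        System.out.println(describe("part-r-00000")); // reduce task 0
        System.out.println(describe("part-r-00001")); // reduce task 1
        System.out.println(describe("part-m-00000")); // map task 0
    }
}
```

If the output directory held part-m-* files instead, that would point to a job running with zero reduce tasks.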
Steven Willis 2012-04-17, 20:19
Try putting @Override before your reduce method to make sure you're overriding the method properly. You'll get a compile time error if not.
-Steven Willis
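Why @Override helps here can be shown without Hadoop. The sketch below uses a stand-in base class, BaseReducer, which is an assumption made for illustration: like the new-API Reducer, its default reduce method simply passes every value through. A subclass that declares reduce with a different parameter type adds an overload rather than an override, so the framework keeps calling the default method and the custom code never runs. All class names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverloadDemo {
    // Runs the broken reducer the way the stand-in framework would.
    static List<String> runBroken() {
        BrokenReducer reducer = new BrokenReducer();
        reducer.run(1, Arrays.asList("a", "b"));
        return reducer.output;
    }

    public static void main(String[] args) {
        // Prints [a, b]: the input values pass through untouched.
        System.out.println(runBroken());
    }
}

// Stand-in for the framework's base class (an assumption, not Hadoop code).
class BaseReducer {
    final List<String> output = new ArrayList<>();

    // Default behaviour: identity, every value is emitted as-is.
    protected void reduce(int key, Iterable<String> values) {
        for (String value : values) {
            output.add(value);
        }
    }

    // Stand-in for the framework, which always calls the Iterable version.
    void run(int key, List<String> values) {
        reduce(key, values);
    }
}

// Same kind of mistake: the parameter type differs from the base method.
class BrokenReducer extends BaseReducer {
    // Putting @Override on this method would be a compile-time error,
    // because it overloads reduce instead of overriding it.
    public void reduce(int key, Iterator<String> values) {
        output.add("custom reducer ran");
    }
}
```

Without the annotation the code compiles cleanly and fails silently at run time; with it, the compiler reports the signature mismatch.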
Arko Provo Mukherjee 2012-04-17, 23:16
Hello,
Thanks everyone for helping me. Here are my observations:
Devaraj - I didn't find any bug in the log files. In fact, none of the print statements in my reducer are even appearing in the logs. I can share the syslogs if you want. I didn't paste them here so that the email doesn't get cluttered.
Kasi - Thanks for the suggestion. I tried it but got the same output. The system created just 1 reducer, as my test data set is small.
Bejoy - Can you please advise how I can pinpoint whether the IdentityReducer is being used or not?
Steven - I tried compiling with your suggestion. However, if I put an @Override on top of my reduce method, I get the following error:
"method does not override or implement a method from a supertype"
The code compiles without it. I do have an @Override on top of my map method, though.

public class Reduce_First extends Reducer<IntWritable, Text, NullWritable, Text>
{
    public void reduce (IntWritable key, Iterator<Text> values, Context context) throws IOException, InterruptedException
    {
        while ( values.hasNext() )
            // Process

        // Finally emit
    }
}
Thanks a lot again!
Warm regards Arko
George Datskos 2012-04-17, 23:59
Arko,
Change Iterator to Iterable
George
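Once the parameter type becomes Iterable, a loop written as while ( values.hasNext() ) no longer compiles, because Iterable has no hasNext method. The plain-Java sketch below shows two ways to port such a loop; the class and method names are hypothetical, and String values stand in for Hadoop's Text purely for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IterablePortingDemo {
    // Old-API style: the values arrive as an Iterator.
    static String joinOldStyle(Iterator<String> values) {
        StringBuilder sb = new StringBuilder();
        while (values.hasNext()) {
            sb.append(values.next());
        }
        return sb.toString();
    }

    // New-API style: the values arrive as an Iterable, so for-each works.
    static String joinNewStyle(Iterable<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String value : values) {
            sb.append(value);
        }
        return sb.toString();
    }

    // Smallest possible port: keep the old loop and ask for the iterator.
    static String joinMinimalPort(Iterable<String> values) {
        return joinOldStyle(values.iterator());
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c");
        System.out.println(joinNewStyle(values));    // abc
        System.out.println(joinMinimalPort(values)); // abc
    }
}
```

Either form works after the signature change; the for-each version is the more common idiom with the new API.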
Arko Provo Mukherjee 2012-04-18, 00:22
Hello George,
It worked. Thanks so much!! Bad typo while porting :(
Thanks again to everyone who helped!!
Warm regards Arko