-
Reducers without output files
Arko Provo Mukherjee 2011-09-15, 03:26
Hello Everyone,
I have a small issue with my Reducer that I am trying to figure out and wanted some advice.
In the reducer, when writing to the output file as declared in FileOutputFormat.setOutputPath() I want to write only the key and not the value when I am calling output.collect().
Is there a way I can ignore the key part?
Else,
Can I write a Reducer function that doesn't call output.collect() at all?
Say I omit the FileOutputFormat.setOutputPath() in the Driver Class.
I can then manually write the output to HDFS in the format I like.
Is this a legal way to do things?
Many thanks in advance!
Warm Regards
Arko
-
Re: Reducers without output files
bejoy.hadoop@... 2011-09-15, 04:09
Hi Arko,
You can achieve the same within the existing MapReduce framework itself. Pass a NullWritable in place of the reducer output value in the reduce function, and set the output value type to NullWritable in your driver class as well.
Regards
Bejoy K S
-
Re: Reducers without output files
bejoy.hadoop@... 2011-09-15, 04:12
Arko,
To add on: if you want to ignore the key part, substitute NullWritable for the key, and make the corresponding change to the output key type in the driver class.
Hope it helps.
Regards
Bejoy K S
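A minimal sketch of why the NullWritable substitution drops one half of each output line. As far as I understand it, the default text output format skips a key or value that is a NullWritable and writes the tab separator only when both halves are present. The code below is a plain-Java imitation of that rule, not Hadoop's source: NullPlaceholder, TextLineSketch and formatLine are hypothetical names standing in for NullWritable and the record writer.

```java
// Imitates the line-formatting rule of a text record writer (hypothetical stand-in, not Hadoop code).
final class TextLineSketch {
    static final String SEPARATOR = "\t";

    // Returns the line that would be written, or null when there is nothing to write.
    static String formatLine(Object key, Object value) {
        boolean nullKey = key == null || key instanceof NullPlaceholder;
        boolean nullValue = value == null || value instanceof NullPlaceholder;
        if (nullKey && nullValue) {
            return null; // both halves are placeholders: no line at all
        }
        StringBuilder line = new StringBuilder();
        if (!nullKey) {
            line.append(key);
        }
        if (!nullKey && !nullValue) {
            line.append(SEPARATOR); // separator only when both halves exist
        }
        if (!nullValue) {
            line.append(value);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatLine(42, "some text"));                    // key, tab, value
        System.out.println(formatLine(NullPlaceholder.get(), "some text")); // value only
        System.out.println(formatLine(42, NullPlaceholder.get()));          // key only
    }
}

// Hypothetical stand-in for NullWritable: one shared instance, no public constructor.
final class NullPlaceholder {
    private static final NullPlaceholder INSTANCE = new NullPlaceholder();

    private NullPlaceholder() {}

    static NullPlaceholder get() {
        return INSTANCE;
    }
}
```

With a placeholder key the line holds only the value, and with a placeholder value it holds only the key, which covers both readings of the original question.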
-
Re: Reducers without output files
Arko Provo Mukherjee 2011-09-15, 05:16
Hello,
Many thanks for your reply!
So to clarify, I should do the following:

public static class Reduce extends MapReduceBase implements Reducer<IntWritable, Text, NullWritable, Text> {
    reduce () { // Pseudo reduce function - ignoring the proper syntax
        // The processing goes here.
        output.collect ( new NullWritable(), new Text(output_string) );
    }
}

Finally in the main method of the Driver Class:

// For the Map Class
jobconf.setMapOutputKeyClass(IntWritable.class);
jobconf.setMapOutputValueClass(Text.class);

// For the Reduce Class
jobconf.setOutputKeyClass(NullWritable.class);
jobconf.setOutputValueClass(Text.class);
Please do correct me if my understanding is wrong.
Thanks again for your help!
Warm Regards Arko
-
Re: Reducers without output files
bejoy.hadoop@... 2011-09-15, 06:02
Arko,
You are right. One minor correction: you can't create an object with new NullWritable(). Unlike the other Writables in Hadoop, NullWritable is an immutable singleton. When you need a NullWritable instance, call NullWritable.get() instead, i.e.

output.collect ( NullWritable.get(), new Text(output_string) );
Regards Bejoy K S
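For readers unfamiliar with the singleton pattern mentioned above, here is a rough plain-Java sketch of the idea. PlaceholderWritable and SingletonSketch are hypothetical names used only for illustration; this is not Hadoop's NullWritable, just a class built the same general way, with a private constructor and a static get() accessor.

```java
// Demonstrates a singleton with a private constructor (hypothetical stand-in, not Hadoop code).
final class SingletonSketch {
    public static void main(String[] args) {
        // new PlaceholderWritable();  // would not compile: the constructor is private
        PlaceholderWritable first = PlaceholderWritable.get();
        PlaceholderWritable second = PlaceholderWritable.get();
        System.out.println(first == second); // both names refer to the one shared instance
    }
}

final class PlaceholderWritable {
    // The only instance ever created; it holds no state, so sharing it is safe.
    private static final PlaceholderWritable INSTANCE = new PlaceholderWritable();

    private PlaceholderWritable() {}

    static PlaceholderWritable get() {
        return INSTANCE;
    }
}
```

Because the object carries no data, there is nothing to gain from a fresh instance per record, which is presumably why the accessor form is the only one offered.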
-
Re: Reducers without output files
Arko Provo Mukherjee 2011-09-15, 06:47
Great!! Thanks so much for the help! Warm regards Arko