-
Intermediate files generated.
Pramy Bhats 2010-07-01, 22:07
Hi,
I am using the Hadoop framework to write MapReduce jobs. I want to redirect the output of the map phase into files of my choice and later use those files as input for the reduce phase. Could you please suggest how to proceed?
thanks, --PB.
-
Re: Intermediate files generated.
Alex Loddengaard 2010-07-01, 22:14
You could use the HDFS API from within your mapper, and run with 0 reducers.
Alex
-
Re: Intermediate files generated.
Pramy Bhats 2010-07-01, 23:08
Hi Alex,
I don't think this is what I am looking for. Essentially, I wish to run both the mapper and the reducer. But at the same time, I wish to make sure that the temp files used between mappers and reducers are of my choice. Here, choice means that I can specify the files in HDFS that are used as temp files.
thanks, --PB.
-
Re: Intermediate files generated.
Hemanth Yamijala 2010-07-02, 01:52
Alex,
> I don't think this is what I am looking for. Essentially, I wish to run both the mapper and the reducer. But at the same time, I wish to make sure that the temp files used between mappers and reducers are of my choice. Here, choice means that I can specify the files in HDFS that are used as temp files.
Could you explain why you want to do this?
-
Re: Intermediate files generated.
Pramy Bhats 2010-07-02, 06:05
Hi Hemanth,
I need to use the output of the mapper for some other application. If I can redirect the map output into temp files of my choice (stored on HDFS), then I can reuse that output later. At the same time, the succeeding reducer can read its input from these temp files without any overhead.
thanks, --PB
-
Re: Intermediate files generated.
Jones, Nick 2010-07-02, 11:05
Hi Pramy, I would set up one M/R job to just map (setNumReduceTasks(0)) and chain another job that uses an identity mapper to pass the intermediate data to the reduce step.
Nick Sent by radiation.
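The two-job chain Nick describes can be pictured without a cluster. What follows is only a minimal sketch in plain Java with no Hadoop dependency: the class name `TwoStagePipeline`, its methods, and the tab-separated file format are all assumptions made for illustration. In a real job the first stage would correspond to a map-only job (zero reduce tasks) writing to an output path of your choice, and the second stage to a job whose mapper passes records through unchanged.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hadoop-free model of the two-job chain; every name here is hypothetical.
class TwoStagePipeline {

    // Stage 1 ("map-only job"): emit (word, 1) pairs and persist them as
    // tab-separated lines in a file whose location the caller chooses.
    static void mapOnly(List<String> inputLines, Path intermediate) {
        List<String> pairs = new ArrayList<>();
        for (String line : inputLines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(word + "\t1");
                }
            }
        }
        try {
            Files.write(intermediate, pairs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stage 2 ("identity map + reduce job"): read the persisted pairs back
    // unchanged, group them by key, and sum the values per key.
    static Map<String, Integer> identityMapAndReduce(Path intermediate) {
        Map<String, Integer> totals = new TreeMap<>();
        try {
            for (String record : Files.readAllLines(intermediate)) {
                String[] kv = record.split("\t");
                totals.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return totals;
    }

    public static void main(String[] args) {
        Path intermediate;
        try {
            intermediate = Files.createTempFile("map-output", ".tsv");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        mapOnly(List.of("a b a", "b c"), intermediate);
        // The intermediate file is still on disk at this point, so another
        // application could read it independently of the reduce stage.
        System.out.println(identityMapAndReduce(intermediate));
    }
}
```

The point of the split is that the intermediate data lives at a path the caller controls and survives after the second stage has run, at the cost of one extra read and write compared with a single map/reduce job.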
-
Re: Intermediate files generated.
Pramy Bhats 2010-07-02, 12:17
Hi,
Isn't it possible to hack into the intermediate files generated?
I am writing a compilation framework, so I don't want to mess with the existing programming framework. The upper layer or the programmer should write the program the way they normally would, and I want to leverage the intermediate files generated for my analysis.
thanks, --PB.
-
Re: Intermediate files generated.
Ken Goodhope 2010-07-02, 20:19
You could also use multi output (MultipleOutputs) from the old API. This will allow you to create multiple output collectors. One collector could be used at the beginning of the reduce call to write the key-value pairs unaltered, and another collector to write the results of your processing.
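The shape of that idea, a reducer that feeds two collectors, can be sketched in plain Java. This is a Hadoop-free model under stated assumptions: the class `TeeReducer`, its `reduce` signature, and the use of `BiConsumer` as a stand-in for an output collector are all hypothetical and are not the Hadoop API. Because a reducer normally sees its values only once, the sketch copies each pair to the raw collector while it iterates rather than in a separate pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hadoop-free model of a reducer with two collectors; names are hypothetical.
class TeeReducer {

    // Called once per key, as a reduce function would be. The raw collector
    // receives every incoming pair unaltered; the result collector receives
    // the processed output (here, the sum of the values).
    static void reduce(String key, List<Integer> values,
                       BiConsumer<String, Integer> rawCollector,
                       BiConsumer<String, Integer> resultCollector) {
        int sum = 0;
        for (int value : values) {
            rawCollector.accept(key, value); // copy of the reducer's input
            sum += value;
        }
        resultCollector.accept(key, sum);    // the job's normal output
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>();
        List<String> results = new ArrayList<>();
        reduce("a", List.of(1, 1, 1),
               (k, v) -> raw.add(k + "\t" + v),
               (k, v) -> results.add(k + "\t" + v));
        System.out.println(raw);
        System.out.println(results);
    }
}
```

With this arrangement the user's map and reduce logic stays as written; only the reducer gains one extra write per input pair, and the raw copy ends up alongside the normal job output.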
-
Re: Intermediate files generated.
Pramy Bhats 2010-07-08, 23:29
Correct me if I am wrong: the output of the mappers goes to the local file system, and the reducers later fetch that output.
If the above statement is correct, can we specify a file of our choice so that the mappers write their output to a desired location?
thanks, --Paul
-
Re: Intermediate files generated.
Ted Yu 2010-07-08, 23:40
The first part of the statement isn't necessarily correct - SequenceFile is written to HDFS.