Hadoop >> mail # user >> Intermediate files generated.


Re: Intermediate files generated.
The first part of the statement isn't necessarily correct: a SequenceFile is
written to HDFS.

On Thu, Jul 8, 2010 at 4:29 PM, Pramy Bhats <[EMAIL PROTECTED]>wrote:

> Correct me if I am wrong: the output of the mappers goes to the local file
> system, and the reducers later fetch the output of the mappers.
>
> If the above statement is correct, can we specify files of our choice so
> that the mapper output is written to a desired location?
>
> thanks,
> --Paul
>
> On Fri, Jul 2, 2010 at 10:19 PM, Ken Goodhope <[EMAIL PROTECTED]>
> wrote:
>
> > You could also use MultipleOutputs from the old API.  This will allow you
> > to create multiple output collectors.  One collector could be used at
> > the beginning of the reduce call for writing the key-value pairs
> > unaltered, and another collector for writing the results of your
> > processing.
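[Editor's note: a rough sketch of the suggestion above, using the old `org.apache.hadoop.mapred` API. The named output "raw", the class names, and the `process` placeholder are illustrative, not from the thread; the driver would also need `MultipleOutputs.addNamedOutput(conf, "raw", TextOutputFormat.class, Text.class, Text.class)`.]

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class PassThroughReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private MultipleOutputs mos;

  @Override
  public void configure(JobConf conf) {
    mos = new MultipleOutputs(conf);
  }

  @Override
  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      Text value = values.next();
      // Write the key-value pair unaltered to the extra "raw" named output...
      mos.getCollector("raw", reporter).collect(key, value);
      // ...then write the processed result to the job's normal output.
      output.collect(key, process(value));
    }
  }

  private Text process(Text value) {
    return value; // placeholder for the actual reduce logic
  }

  @Override
  public void close() throws IOException {
    mos.close(); // flush the extra collectors
  }
}
```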
> >
> > On Fri, Jul 2, 2010 at 5:17 AM, Pramy Bhats <[EMAIL PROTECTED]
> > >wrote:
> >
> > > Hi,
> > >
> > > Isn't it possible to hook into the intermediate files generated?
> > >
> > > I am writing a compilation framework, so I don't want to interfere with
> > > the existing programming framework. The upper layer, i.e. the
> > > programmer, should write the program the way they normally would, and I
> > > want to leverage the intermediate files generated for my analysis.
> > >
> > > thanks,
> > > --PB.
> > >
> > > On Fri, Jul 2, 2010 at 1:05 PM, Jones, Nick <[EMAIL PROTECTED]>
> wrote:
> > >
> > > > Hi Pramy,
> > > > I would set up one M/R job to just map (setNumReduceTasks(0)) and
> > > > chain another job that uses an identity mapper to pass the
> > > > intermediate data to the reduce step.
> > > >
> > > > Nick
> > > > Sent by radiation.
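[Editor's note: the two-job chain suggested above might look roughly like this with the old API. `MyMapper`, `MyReducer`, and the `/tmp/map-output` path are placeholders for your own classes and a location of your choosing.]

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class TwoStageDriver {
  public static void main(String[] args) throws Exception {
    // Job 1: map only -- the map output lands in HDFS at a known path.
    JobConf mapJob = new JobConf(TwoStageDriver.class);
    mapJob.setMapperClass(MyMapper.class);   // your real mapper (placeholder)
    mapJob.setNumReduceTasks(0);             // no reduce, no shuffle
    mapJob.setOutputFormat(SequenceFileOutputFormat.class);
    FileInputFormat.setInputPaths(mapJob, new Path(args[0]));
    FileOutputFormat.setOutputPath(mapJob, new Path("/tmp/map-output"));
    JobClient.runJob(mapJob);

    // Job 2: identity mapper feeds the saved map output into the reducer.
    JobConf reduceJob = new JobConf(TwoStageDriver.class);
    reduceJob.setInputFormat(SequenceFileInputFormat.class);
    reduceJob.setMapperClass(IdentityMapper.class);
    reduceJob.setReducerClass(MyReducer.class); // your real reducer (placeholder)
    FileInputFormat.setInputPaths(reduceJob, new Path("/tmp/map-output"));
    FileOutputFormat.setOutputPath(reduceJob, new Path(args[1]));
    JobClient.runJob(reduceJob);
  }
}
```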
> > > >
> > > > ----- Original Message -----
> > > > From: Pramy Bhats <[EMAIL PROTECTED]>
> > > > To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> > > > Sent: Fri Jul 02 01:05:25 2010
> > > > Subject: Re: Intermediate files generated.
> > > >
> > > > Hi Hemanth,
> > > >
> > > > I need to use the output of the mapper for some other application. As
> > > > a result, if I can redirect the output of the map to temp files of my
> > > > choice (which are stored on HDFS), then I can reuse the output later.
> > > > At the same time, the succeeding reducer can read its input from
> > > > these temp files without any overhead.
> > > >
> > > > thanks,
> > > > --PB
> > > >
> > > > On Fri, Jul 2, 2010 at 3:52 AM, Hemanth Yamijala <[EMAIL PROTECTED]
> >
> > > > wrote:
> > > >
> > > > > Alex,
> > > > >
> > > > > > I don't think this is what I am looking for. Essentially, I wish
> > > > > > to run both the mapper and the reducer, but at the same time I
> > > > > > wish to make sure that the temp files used between the mappers
> > > > > > and reducers are of my choice. Here, "choice" means that I can
> > > > > > specify the files in HDFS that can be used as temp files.
> > > > >
> > > > > Could you explain why you want to do this ?
> > > > >
> > > > > >
> > > > > > thanks,
> > > > > > --PB.
> > > > > >
> > > > > > On Fri, Jul 2, 2010 at 12:14 AM, Alex Loddengaard <
> > [EMAIL PROTECTED]
> > > >
> > > > > wrote:
> > > > > >
> > > > > >> You could use the HDFS API from within your mapper, and run with
> > > > > >> 0 reducers.
> > > > > >>
> > > > > >> Alex
> > > > > >>
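[Editor's note: a minimal sketch of this approach with the old mapred API. The `/tmp/map-side-output` path and the per-task file naming are assumptions, not from the thread; the job would also be configured with `conf.setNumReduceTasks(0)` so only the map runs.]

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class HdfsWritingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private FSDataOutputStream out;

  @Override
  public void configure(JobConf conf) {
    try {
      FileSystem fs = FileSystem.get(conf);
      // One file per map task, named after the task attempt to avoid clashes.
      String task = conf.get("mapred.task.id");
      out = fs.create(new Path("/tmp/map-side-output/" + task));
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  @Override
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Write whatever record format you want your other application to read.
    out.writeBytes(value.toString() + "\n");
  }

  @Override
  public void close() throws IOException {
    out.close();
  }
}
```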
> > > > > >> On Thu, Jul 1, 2010 at 3:07 PM, Pramy Bhats <
> > > > [EMAIL PROTECTED]
> > > > > >> >wrote:
> > > > > >>
> > > > > >> > Hi,
> > > > > >> >
> > > > > >> > I am using the Hadoop framework for writing MapReduce jobs. I
> > > > > >> > want to redirect the output of the Map into files of my choice
> > > > > >> > and later use those files as input for the Reduce phase.
> > > > > >> >
> > > > > >> >
> > > > > >> > Could you please suggest, how to proceed for it ?
> > > > > >> >
> > > > > >> > thanks,
> > > > > >> > --PB.
> > > > > >> >
> > > > > >>
> > > > > >
>