Re: Programming Multiple rounds of mapreduce
Thanks Matt,

Arko, if you plan to use Oozie, you can set up a simple coordinator job for
this. For example, the following schedules a workflow every 5 minutes that
consumes the output produced by the previous run; you only need to provide
the initial data:



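<coordinator-app name="coord-1" frequency="${coord:minutes(5)}"
                 start="${start}" end="${end}" timezone="UTC">
  <datasets>
    <dataset name="data" frequency="${coord:minutes(5)}"
             initial-instance="${start}" timezone="UTC">
      <uri-template>...</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="input" dataset="data"><instance>...</instance></data-in>
  </input-events>
  <output-events>
    <data-out name="output" dataset="data"><instance>...</instance></data-out>
  </output-events>
  <action>...</action>
</coordinator-app>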
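
If the number of rounds is fixed and they should run back to back rather than on a schedule, the chaining Matt mentions below can probably be expressed as a single Oozie workflow with one map-reduce action per round. This is only a rough sketch, not something taken from this thread: the `${jobTracker}`, `${nameNode}`, `${inputDir}`, `${workDir}` and `${outputDir}` parameters and the `org.example.*` class names are hypothetical placeholders you would define yourself. The point to notice is that round 2 reads the directory that round 1 wrote.

```xml
<workflow-app name="two-rounds-wf" xmlns="uri:oozie:workflow:0.1">
  <start to="round-1"/>

  <!-- Round 1: reads the initial data, writes an intermediate directory -->
  <action name="round-1">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <!-- org.example.* classes are hypothetical -->
        <property><name>mapred.mapper.class</name><value>org.example.Round1Mapper</value></property>
        <property><name>mapred.reducer.class</name><value>org.example.Round1Reducer</value></property>
        <property><name>mapred.input.dir</name><value>${inputDir}</value></property>
        <property><name>mapred.output.dir</name><value>${workDir}/round-1</value></property>
      </configuration>
    </map-reduce>
    <ok to="round-2"/>
    <error to="fail"/>
  </action>

  <!-- Round 2: its input is exactly the output of round 1 -->
  <action name="round-2">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property><name>mapred.mapper.class</name><value>org.example.Round2Mapper</value></property>
        <property><name>mapred.reducer.class</name><value>org.example.Round2Reducer</value></property>
        <property><name>mapred.input.dir</name><value>${workDir}/round-1</value></property>
        <property><name>mapred.output.dir</name><value>${outputDir}</value></property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Round failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Each additional round would be one more action wired in through the `ok` transition of the round before it. If the number of rounds is not known up front, a plain driver program that submits one job per round in a loop is likely the simpler route.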



On Mon, Jun 13, 2011 at 3:01 PM, GOEKE, MATTHEW (AG/1000) wrote:

> If you know for certain that it needs to be split into multiple work units,
> I would suggest looking into Oozie. It is easy to install and lightweight,
> with a low learning curve; for my purposes it has been very helpful so far.
> I am also fairly certain you can chain multiple job confs into the same run,
> but I have not actually tried that, so I can't promise it is easy or even
> possible.
> http://www.cloudera.com/blog/2010/07/whats-new-in-cdh3-b2-oozie/
> If you are not running CDH3u0 then you can also get the tarball and
> documentation directly here:
> https://ccp.cloudera.com/display/SUPPORT/CDH3+Downloadable+Tarballs
> Matt
> -----Original Message-----
> From: Marcos Ortiz [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 13, 2011 4:57 PM
> Cc: Arko Provo Mukherjee
> Subject: Re: Programming Multiple rounds of mapreduce
> Well, you can define a job for each round, and then define a workflow
> around your implementation that chains those jobs together.
> On 6/13/2011 5:46 PM, Arko Provo Mukherjee wrote:
> > Hello,
> >
> > I am trying to write a program that needs multiple rounds
> > of map and reduce.
> >
> > The output of each round of map-reduce must be fed in as the input
> > of the next round.
> >
> > Can anyone please point me to a link or other material that explains
> > how I can achieve this?
> >
> > Thanks a lot in advance!
> >
> > Thanks & regards
> > Arko
> --
> Marcos Luís Ortíz Valmaseda
>  Software Engineer (UCI)
>  http://marcosluis2186.posterous.com
>  http://twitter.com/marcosluis2186