Hadoop >> mail # general >> Chaining MapReduce Jobs


Re: Chaining MapReduce Jobs
Have you looked at the ToolRunner class?

On Nov 8, 2012, at 7:03 AM, Claudio Reggiani <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I would like to run a Hadoop program composed of
> Map1-Red1->Map2-Red2->Map3-Red3. I've read "Hadoop in Action" and several
> articles online, but all of them are either based on API <= 0.20 or
> have just a few lines of code.
>
> I'm working with Hadoop 1.0.3 and I think the best solution is to use
> the JobControl class, but I haven't found a good example of that.
>
> In my particular application the MapReduce jobs are executed in sequence,
> so it should be possible to run the first job, then the second, and finally
> the third. The problem is that I need to set the input and output
> directories for the second job, which means linking the output of job1
> to the input of job2, and I don't know how to do that.
>
> Any suggestions or resources to solve this problem? Even source code on
> GitHub would be good.
>
> Thanks
> Claudio
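
For jobs that run strictly in sequence, one common approach is a plain driver loop in which each job's output directory becomes the next job's input directory. The following is only a minimal sketch of that wiring in plain Java, not code from this thread: the `Stage` interface, `runChain`, and the `work/stageN` directory names are hypothetical. With the Hadoop 1.x `org.apache.hadoop.mapreduce` API, each stage body would presumably build a `Job`, call `FileInputFormat.addInputPath(job, new Path(inputDir))` and `FileOutputFormat.setOutputPath(job, new Path(outputDir))`, and return `job.waitForCompletion(true)`. The driver itself could implement `Tool` and be launched through `ToolRunner.run(...)` so that generic command-line options are parsed.

```java
import java.util.ArrayList;
import java.util.List;

class ChainedJobsSketch {

    /** Hypothetical stand-in for configuring and running one MapReduce job. */
    interface Stage {
        // Returns true on success, mirroring the boolean that
        // Job.waitForCompletion(true) returns in Hadoop.
        boolean run(String inputDir, String outputDir) throws Exception;
    }

    /**
     * Runs the stages in order. Stage 1 reads inputDir; every later stage
     * reads the directory the previous stage wrote. Stops at the first failure.
     * Returns the directories used, in order: [inputDir, out1, out2, ...].
     */
    static List<String> runChain(String inputDir, String workDir, List<Stage> stages)
            throws Exception {
        List<String> dirs = new ArrayList<>();
        dirs.add(inputDir);
        String current = inputDir;
        for (int i = 0; i < stages.size(); i++) {
            // Assumed naming scheme for intermediate output directories.
            String out = workDir + "/stage" + (i + 1);
            if (!stages.get(i).run(current, out)) {
                throw new IllegalStateException("stage " + (i + 1) + " failed");
            }
            dirs.add(out);
            current = out; // output of this job is the input of the next one
        }
        return dirs;
    }

    public static void main(String[] args) throws Exception {
        List<Stage> stages = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            // Placeholder stage: a real one would submit a Hadoop job here.
            stages.add((in, out) -> {
                System.out.println(in + " -> " + out);
                return true;
            });
        }
        runChain("input", "work", stages);
    }
}
```

Run as written, `main` prints `input -> work/stage1`, `work/stage1 -> work/stage2`, and `work/stage2 -> work/stage3`. The lambdas need Java 8 or later; on an older JDK the stages would be anonymous classes instead. `JobControl` is more useful when the jobs form a dependency graph rather than a straight line.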