Claudio Reggiani 2012-11-08, 13:03
Re: Chaining MapReduce Jobs
Michael Segel 2012-11-08, 19:12
Have you looked at the ToolRunner class?
On Nov 8, 2012, at 7:03 AM, Claudio Reggiani <[EMAIL PROTECTED]> wrote:
> I would like to run a Hadoop program composed of
> Map1-Red1->Map2-Red2->Map3-Red3. I've read "Hadoop in Action" and several
> articles online, but all of them are either based on API <= 0.20 or
> have just a few lines of code.
> I'm working with Hadoop 1.0.3 and I think the best solution is to use the
> JobControl class, but I haven't found a good example of it.
> In my particular application the MapReduce jobs are executed in sequence,
> so it would be possible to run the first job, then the second, and finally
> the third. The problem is that I need to set the input and output
> directories for the second job, and its input has to be the output of
> job1. I don't know how to link the two.
> Any suggestions or resources to solve this problem? Even source code on
> GitHub would be good.
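
For a strictly sequential chain, one common approach is a single driver launched through ToolRunner that runs the jobs one after another and passes each job's output directory to the next job as its input. The following is only a minimal sketch under assumptions, not a tested solution for this application: the class name ChainDriver, the scratch directory argument, and the stage1/stage2 subdirectory names are made up for illustration, and the base Mapper and Reducer classes (which pass records through unchanged) stand in for the real Map1/Red1, Map2/Red2 and Map3/Red3.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver: runs three jobs in sequence, using each job's
// output directory as the next job's input directory.
public class ChainDriver extends Configured implements Tool {

    // Configures and runs one stage; returns true if the job succeeded.
    private boolean runStage(String name, Path in, Path out) throws Exception {
        Job job = new Job(getConf(), name);
        job.setJarByClass(ChainDriver.class);
        // Placeholders (assumption): the base Mapper and Reducer pass records
        // through unchanged. Replace them with the real classes per stage.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);
        FileInputFormat.addInputPath(job, in);
        // The output directory must not exist when the job starts.
        FileOutputFormat.setOutputPath(job, out);
        // Blocks until the job finishes, printing progress to the console.
        return job.waitForCompletion(true);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: ChainDriver <input> <workdir> <output>");
            return 2;
        }
        Path input = new Path(args[0]);
        Path work = new Path(args[1]);      // hypothetical scratch directory
        Path output = new Path(args[2]);
        Path out1 = new Path(work, "stage1"); // output of job1, input of job2
        Path out2 = new Path(work, "stage2"); // output of job2, input of job3

        // Stop the chain as soon as one stage fails.
        if (!runStage("job1", input, out1)) return 1;
        if (!runStage("job2", out1, out2)) return 1;
        return runStage("job3", out2, output) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options (-D, -conf, ...) before calling run().
        System.exit(ToolRunner.run(new Configuration(), new ChainDriver(), args));
    }
}
```

Because waitForCompletion blocks, job2 is only submitted after job1 has finished writing its output, so no explicit dependency handling is needed for a linear chain. JobControl is probably more useful when some of the jobs are independent of each other and can run in parallel.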