Bertrand Dechoux 2012-11-24, 11:56
Sean McNamara 2012-11-23, 22:22
Re: Multi-stage map/reduce jobs
Jay Vyas 2012-11-23, 22:50
Hadoop is not an API for orchestrating MapReduce jobs. Fortunately, there is no need for such an API: each MapReduce job can simply be run like a normal Java class.
So, to run multiple MapReduce jobs? Easy: you create a main() method in a single class that runs the jobs one after another, calling waitForCompletion() on each. That method blocks until the job completes, so the next job starts only after the previous one has finished.

Jay Vyas
http://jayunit100.blogspot.com

On Nov 23, 2012, at 5:22 PM, Sean McNamara <[EMAIL PROTECTED]> wrote:

> It's not clear to me how to stitch together multiple map reduce jobs. Without using cascading or something else like it, is the method basically to write to an intermediate spot, and have the next stage read from there?
>
> If so, how are jobs responsible for cleaning up the temp/intermediate data they create? What happens if stage 1 completes, and stage 2 doesn't, do the stage 1 files get left around?
>
> Does anyone have some insight they could share?
>
> Thanks.
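
As a rough illustration of the driver pattern described in the reply, here is a minimal sketch in plain Java. It deliberately avoids the Hadoop classes so that it runs on its own: the `Stage` interface, `JobChainDriver`, `runChain`, and the `stage-N` scratch naming are all hypothetical names invented for this example. In a real driver, each stage would configure a Hadoop `Job` and return the result of `job.waitForCompletion(true)`. The cleanup in the `finally` block is one possible answer to the question about leftover intermediate files, not something Hadoop does for you.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical stand-in for one MapReduce job. With Hadoop, run() would set the
// job's input/output paths and return job.waitForCompletion(true).
interface Stage {
    boolean run(Path input, Path output) throws Exception;
}

final class JobChainDriver {

    // Runs the stages in order. Each stage reads what the previous one wrote;
    // intermediate results go under `scratch`, the last result to `finalOutput`.
    // Returns false as soon as a stage reports failure.
    static boolean runChain(List<Stage> stages, Path input, Path finalOutput, Path scratch)
            throws Exception {
        Files.createDirectories(scratch);
        Path current = input;
        try {
            for (int i = 0; i < stages.size(); i++) {
                boolean last = (i == stages.size() - 1);
                Path out = last ? finalOutput : scratch.resolve("stage-" + (i + 1));
                if (!stages.get(i).run(current, out)) {
                    return false; // later stages never start
                }
                current = out;
            }
            return true;
        } finally {
            // Assumption of this sketch: the driver owns the scratch area and
            // removes it whether the chain succeeded, failed, or threw.
            deleteRecursively(scratch);
        }
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws Exception {
        Path work = Files.createTempDirectory("chain-demo");
        Path input = Files.write(work.resolve("input.txt"),
                "hello\n".getBytes(StandardCharsets.UTF_8));

        // Two toy stages: upper-case the text, then write it out twice.
        Stage upperCase = (in, out) -> {
            String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            Files.write(out, text.toUpperCase().getBytes(StandardCharsets.UTF_8));
            return true;
        };
        Stage duplicate = (in, out) -> {
            String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            Files.write(out, (text + text).getBytes(StandardCharsets.UTF_8));
            return true;
        };

        boolean ok = runChain(Arrays.asList(upperCase, duplicate),
                input, work.resolve("result.txt"), work.resolve("tmp"));
        System.out.println(ok ? "all stages completed" : "a stage failed");
    }
}
```

The point of the `try`/`finally` is that a failure in stage 2 does not leave stage 1's output lying around. If you would rather keep intermediate data for debugging or for restarting from the failed stage, move the cleanup so it runs only on success.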