Hadoop >> mail # user >> How do I synchronize Hadoop jobs?

W.P. McNeill 2012-02-15, 19:23
Re: How do I synchronize Hadoop jobs?
You can use Oozie for that: write a workflow job that forks A and B,
then joins them before running C.
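
To make the fork/join shape concrete, here is a minimal sketch in plain Java. It is not Oozie and not the Hadoop API: runJob is a hypothetical stand-in for "submit a job and block until it finishes", and the class and method names are made up for illustration. It only shows the dependency structure, where A and B run concurrently and C starts once both are done.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

public class ForkJoinSketch {
    // Hypothetical placeholder: records that the job ran and returns a fake output name.
    // A real version would submit a Hadoop job and wait for it to complete.
    static String runJob(String name, List<String> log) {
        log.add(name);
        return name + "-output";
    }

    // Fork A and B, join on both, then run C with their outputs.
    // Returns the order in which the jobs ran.
    static List<String> runPipeline() {
        List<String> log = new CopyOnWriteArrayList<>();

        // Fork: A and B are started on separate threads.
        CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> runJob("A", log));
        CompletableFuture<String> b = CompletableFuture.supplyAsync(() -> runJob("B", log));

        // Join: the combining function is invoked only after both A and B have completed.
        CompletableFuture<String> c =
            a.thenCombine(b, (outA, outB) -> runJob("C(" + outA + "," + outB + ")", log));

        c.join(); // block until C has finished
        return log;
    }

    public static void main(String[] args) {
        // A and B may appear in either order; C is always last.
        System.out.println(runPipeline());
    }
}
```

This only works when one process can launch all three jobs. When it can't, a workflow engine such as Oozie expresses the same fork and join outside your client code.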



On Wed, Feb 15, 2012 at 11:23 AM, W.P. McNeill <[EMAIL PROTECTED]> wrote:
> Say I have two Hadoop jobs, A and B, that can be run in parallel. I have
> another job, C, that takes the output of both A and B as input. I want to
> run A and B at the same time, wait until both have finished, and then
> launch C. What is the best way to do this?
> I know the answer if I've got a single Java client program that launches A,
> B, and C. But what if I don't have the option to launch all of them from a
> single Java program? (Say I've got a much more complicated system with many
> steps happening between A-B and C.) How do I synchronize between jobs and
> make sure there are no race conditions? Is this what ZooKeeper is for?
John Armstrong 2012-02-15, 19:26
Bharath Mundlapudi 2012-02-15, 21:29
Bharath Mundlapudi 2012-02-15, 21:31
bejoy.hadoop@... 2012-02-15, 19:28