Hadoop >> mail # user >> How do I synchronize Hadoop jobs?


W.P. McNeill 2012-02-15, 19:23
Re: How do I synchronize Hadoop jobs?
You can use Oozie for that: write a workflow job that forks A and B
and then joins before C.

Thanks.

Alejandro

On Wed, Feb 15, 2012 at 11:23 AM, W.P. McNeill <[EMAIL PROTECTED]> wrote:
> Say I have two Hadoop jobs, A and B, that can be run in parallel. I have
> another job, C, that takes the output of both A and B as input. I want to
> run A and B at the same time, wait until both have finished, and then
> launch C. What is the best way to do this?
>
> I know the answer if I've got a single Java client program that launches A,
> B, and C. But what if I don't have the option to launch all of them from a
> single Java program? (Say I've got a much more complicated system with many
> steps happening between A-B and C.) How do I synchronize between jobs, make
> sure there are no race conditions, etc.? Is this what ZooKeeper is for?
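
The ordering discussed in this thread (run A and B in parallel, wait for both, then launch C) can be sketched in plain Java. This is only an illustration of the fork/join semantics, not Oozie's or Hadoop's API: the class ForkJoinJobs, the helper runJob, and the "-output" strings are hypothetical placeholders standing in for real job submissions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class ForkJoinJobs {
    // Hypothetical stand-in for submitting a job and blocking until it finishes.
    // A real client would launch a Hadoop job here; this only records the name.
    static String runJob(String name, List<String> log) {
        log.add(name);
        return name + "-output";
    }

    // Fork A and B, join on both, then run C on their outputs.
    static String runPipeline(List<String> log) {
        CompletableFuture<String> a =
            CompletableFuture.supplyAsync(() -> runJob("A", log));
        CompletableFuture<String> b =
            CompletableFuture.supplyAsync(() -> runJob("B", log));

        // thenCombine acts as the join: its callback runs only after
        // both A and B have completed, so C never starts early.
        return a.thenCombine(b, (outA, outB) -> {
            runJob("C", log);
            return "C(" + outA + "," + outB + ")";
        }).join();
    }

    public static void main(String[] args) {
        // Thread-safe list, because A and B may finish on different threads.
        List<String> log = new CopyOnWriteArrayList<>();
        System.out.println(runPipeline(log)); // prints C(A-output,B-output)
    }
}
```

A and B may finish in either order, but C is always the last entry in the log. An Oozie workflow expresses the same dependency declaratively, with a fork node that starts A and B and a join node that transitions to C, which is what makes it usable when the jobs cannot all be launched from one client program.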
John Armstrong 2012-02-15, 19:26
Bharath Mundlapudi 2012-02-15, 21:29
Bharath Mundlapudi 2012-02-15, 21:31
bejoy.hadoop@... 2012-02-15, 19:28