About full pipeline between pig jobs

I wonder whether the M/R jobs compiled from a Pig script support pipelining
between jobs.

For example, let's assume there are 5 independent, consecutive M/R jobs
doing some joining and aggregating tasks.
My question is: can a job be started before its previous job has finished, so
that the previous job doesn't need to write all of its reduce output to HDFS?
I just can't find any material discussing this.
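
To make the question concrete, here is a toy Python sketch of the two execution styles I have in mind. It is not Pig or Hadoop code: the names stage_one, stage_two, run_materialized and run_pipelined are made up for illustration, and an in-memory list merely stands in for the intermediate output on HDFS.

```python
from typing import Iterable, Iterator, List


def stage_one(records: Iterable[int], log: List[str]) -> Iterator[int]:
    """Hypothetical stand-in for the first job: one output per input record."""
    for r in records:
        out = r * 2
        log.append(f"stage1 produced {out}")
        yield out


def stage_two(records: Iterable[int], log: List[str]) -> Iterator[int]:
    """Hypothetical stand-in for the next job, consuming the first job's output."""
    for r in records:
        log.append(f"stage2 consumed {r}")
        yield r + 1


def run_materialized(data: Iterable[int], log: List[str]) -> List[int]:
    # The first stage runs to completion and its whole output is stored
    # (a list here, playing the role of HDFS) before the second stage starts.
    intermediate = list(stage_one(data, log))
    return list(stage_two(intermediate, log))


def run_pipelined(data: Iterable[int], log: List[str]) -> List[int]:
    # The second stage pulls records from the first one as they are produced,
    # so no complete intermediate result is ever stored.
    return list(stage_two(stage_one(data, log), log))


if __name__ == "__main__":
    for runner in (run_materialized, run_pipelined):
        events: List[str] = []
        result = runner([1, 2], events)
        print(runner.__name__, result, events)
```

Both functions return the same result; only the order of the logged events differs. In the pipelined version the second stage starts consuming before the first stage has finished, which is the behaviour I am asking about.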

I think Ab Initio is a good example of a full pipeline architecture.

Thanks & Regards