About full pipelining between Pig jobs
W W 2012-10-22, 10:34
I wonder whether the M/R jobs compiled from a Pig script support pipelining between jobs.
For example, let's assume there are 5 separate, consecutive M/R jobs
doing some joining and aggregating tasks.
My question is: can a job start before its previous job has finished, so
that the previous job doesn't need to write all of its reduce output
to HDFS? I can't find any material that discusses this.
I think Ab Initio is a good example of a fully pipelined architecture.
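To make the question concrete, here is a toy Python sketch of the two execution styles I mean. It is not Pig or Hadoop code: the function names (`materialized_stage`, `streaming_stage`, `consume`) and the sample records are made up for illustration, and it ignores shuffle/sort and fault tolerance entirely. The first style finishes stage 1 and stores all of its output before stage 2 reads anything; the second hands each record to stage 2 as soon as it is produced.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, int]


def materialized_stage(records: Iterable[Record], log: List[str]) -> List[Record]:
    """Hypothetical stage 1 that stores its whole output before returning."""
    out: List[Record] = []
    for key, value in records:
        log.append(f"stage1 emit {key}")
        out.append((key, value * 2))
    return out  # stage 2 can only start after this point


def streaming_stage(records: Iterable[Record], log: List[str]) -> Iterator[Record]:
    """Hypothetical stage 1 that yields each record as soon as it is ready."""
    for key, value in records:
        log.append(f"stage1 emit {key}")
        yield key, value * 2


def consume(records: Iterable[Record], log: List[str]) -> int:
    """Hypothetical stage 2: reads stage 1's output and sums the values."""
    total = 0
    for key, value in records:
        log.append(f"stage2 read {key}")
        total += value
    return total


if __name__ == "__main__":
    data = [("a", 1), ("b", 2)]

    log_a: List[str] = []
    consume(materialized_stage(data, log_a), log_a)
    print(log_a)  # all stage1 entries come before any stage2 entry

    log_b: List[str] = []
    consume(streaming_stage(data, log_b), log_b)
    print(log_b)  # stage1 and stage2 entries alternate
```

Both versions compute the same total; only the order of work differs, which is the difference I am asking about for consecutive M/R jobs.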
Thanks & Regards