Pig, mail # dev - Review Request 14679: Initial implementation of PigProcessor


Review Request 14679: Initial implementation of PigProcessor
Mark Wagner 2013-10-16, 19:12

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14679/
-----------------------------------------------------------

Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
Bugs: PIG-3521
    https://issues.apache.org/jira/browse/PIG-3521
Repository: pig-git

Description
-------

This patch adds the PigProcessor and related changes. The current patch supports MR* jobs.

* Updates the Tez dependency to match Tez's trunk.
* Adds PigProcessor, which roughly follows the existing Mappers and Reducers in Pig.
* The handling of input has been factored out of the PigProcessor into a new interface: InputHandler. Two implementations of InputHandler have been added: FileInputHandler and ShuffledInputHandler.
* Makes changes to TezDagBuilder to serialize and ship the necessary information from the frontend. These changes are mostly inspired by/stolen from the JobControlCompiler.
* Adds a TezPOPackageAnnotator which is analogous to the POPackageAnnotator, but for Tez.
* Fixes a problem with edge creation in the TezDagBuilder.
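
To illustrate the InputHandler factoring described above, here is a minimal sketch of how a processor can consume records without knowing where they came from. It is not the code in the patch: the method names (next, current), the String record type, the "Sketch" class names, and the in-memory data sources are all assumptions made for illustration. The real handlers presumably wrap Tez inputs and Pig tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface: method names are illustrative assumptions,
// not the signatures from the patch.
interface InputHandler {
    boolean next();     // advance to the next record; false when exhausted
    String current();   // the record most recently advanced to
}

// Stand-in for input read directly from a file (map-like stage).
class FileInputHandlerSketch implements InputHandler {
    private final Iterator<String> records;
    private String current;

    FileInputHandlerSketch(List<String> records) {
        this.records = records.iterator();
    }

    @Override
    public boolean next() {
        if (!records.hasNext()) {
            return false;
        }
        current = records.next();
        return true;
    }

    @Override
    public String current() {
        return current;
    }
}

// Stand-in for input arriving from a shuffle (reduce-like stage):
// each record is a key together with its grouped values, in key order.
class ShuffledInputHandlerSketch implements InputHandler {
    private final Iterator<Map.Entry<String, List<String>>> groups;
    private String current;

    ShuffledInputHandlerSketch(TreeMap<String, List<String>> grouped) {
        this.groups = grouped.entrySet().iterator();
    }

    @Override
    public boolean next() {
        if (!groups.hasNext()) {
            return false;
        }
        Map.Entry<String, List<String>> e = groups.next();
        current = e.getKey() + "=" + e.getValue();
        return true;
    }

    @Override
    public String current() {
        return current;
    }
}

// The processor loop depends only on the interface, not on the input source.
class ProcessorSketch {
    static List<String> run(InputHandler in) {
        List<String> out = new ArrayList<>();
        while (in.next()) {
            out.add(in.current());
        }
        return out;
    }
}

class InputHandlerDemo {
    public static void main(String[] args) {
        System.out.println(ProcessorSketch.run(
                new FileInputHandlerSketch(Arrays.asList("a", "b"))));
        TreeMap<String, List<String>> grouped = new TreeMap<>();
        grouped.put("k1", Arrays.asList("x", "y"));
        System.out.println(ProcessorSketch.run(
                new ShuffledInputHandlerSketch(grouped)));
    }
}
```

The point of the split is that the same processing loop serves both kinds of stage; only the handler chosen at setup time differs.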

Diffs
-----

  ivy.xml c603def
  src/org/apache/pig/backend/hadoop/executionengine/tez/FileInputHandler.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/tez/InputHandler.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 6724f2b
  src/org/apache/pig/backend/hadoop/executionengine/tez/ShuffledInputHandler.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 48c0955
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezJobControlCompiler.java 05b0c54
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 4cc9ab4
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java PRE-CREATION

Diff: https://reviews.apache.org/r/14679/diff/

Testing
-------

Only integration testing has been done. Jobs with 1, 2, and 3 stages have been executed successfully. I'll be adding unit tests.

Thanks,

Mark Wagner