Re: Review Request 14679: Initial implementation of PigProcessor

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14679/#review27104
-----------------------------------------------------------

Ship it!
Simple load/store works for me (with some minor fixes, although the frontend throws an exception after the job finishes). I am still trying complex queries, but we can commit this patch first and make further fixes on top of it.

- Daniel Dai
On Oct. 16, 2013, 7:12 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14679/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2013, 7:12 p.m.)
>
>
> Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
>
>
> Bugs: PIG-3521
>     https://issues.apache.org/jira/browse/PIG-3521
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> This patch adds the PigProcessor and related changes. The current patch supports MR* jobs.
>
> * Updates the Tez dependency to match Tez's trunk
> * Adds PigProcessor, which roughly follows the existing Mappers and Reducers in Pig.
> * The handling of input has been factored out of the PigProcessor into a new interface: InputHandler. Two implementations of InputHandler have been added: FileInputHandler and ShuffledInputHandler.
> * Makes changes to TezDagBuilder to serialize and ship the necessary information from the frontend. These changes are mostly inspired by/stolen from the JobControlCompiler.
> * Adds a TezPOPackageAnnotator which is analogous to the POPackageAnnotator, but for Tez.
> * Fixes a problem with edge creation in the TezDagBuilder.
>
>
> Diffs
> -----
>
>   ivy.xml c603def
>   src/org/apache/pig/backend/hadoop/executionengine/tez/FileInputHandler.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/tez/InputHandler.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 6724f2b
>   src/org/apache/pig/backend/hadoop/executionengine/tez/ShuffledInputHandler.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 48c0955
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezJobControlCompiler.java 05b0c54
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 4cc9ab4
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/14679/diff/
>
>
> Testing
> -------
>
> Only integration testing has been done. Jobs with 1, 2, and 3 stages have been executed successfully. I'll be adding unit tests.
>
>
> Thanks,
>
> Mark Wagner
>
>
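
For readers following the thread, the InputHandler split described in the quoted request (one interface, with a file-backed and a shuffle-backed implementation) can be pictured roughly as below. This is only a sketch under assumptions: the method name next(), the class names FileLikeHandler and ShuffleLikeHandler, and the use of String records are all hypothetical stand-ins, and the real InputHandler, FileInputHandler, and ShuffledInputHandler in the patch may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class InputHandlerSketch {
    /** Hypothetical interface; the actual InputHandler in the patch may differ. */
    interface InputHandler {
        /** Returns the next record, or null once the input is exhausted. */
        String next();
    }

    /** Stand-in for a file-backed handler: records are returned one by one, in order. */
    static class FileLikeHandler implements InputHandler {
        private final Iterator<String> records;

        FileLikeHandler(List<String> records) {
            this.records = records.iterator();
        }

        public String next() {
            return records.hasNext() ? records.next() : null;
        }
    }

    /** Stand-in for a shuffle-backed handler: each record is a key with its grouped values. */
    static class ShuffleLikeHandler implements InputHandler {
        private final Iterator<Map.Entry<String, List<String>>> groups;

        ShuffleLikeHandler(SortedMap<String, List<String>> grouped) {
            this.groups = grouped.entrySet().iterator();
        }

        public String next() {
            if (!groups.hasNext()) {
                return null;
            }
            Map.Entry<String, List<String>> group = groups.next();
            return group.getKey() + "=" + group.getValue();
        }
    }

    /** Processor-side loop: it reads records without knowing where they come from. */
    static List<String> drain(InputHandler input) {
        List<String> out = new ArrayList<>();
        for (String record = input.next(); record != null; record = input.next()) {
            out.add(record);
        }
        return out;
    }

    public static void main(String[] args) {
        SortedMap<String, List<String>> grouped = new TreeMap<>();
        grouped.put("k1", Arrays.asList("a", "b"));
        grouped.put("k2", Arrays.asList("c"));

        System.out.println(drain(new FileLikeHandler(Arrays.asList("a", "b", "c"))));
        System.out.println(drain(new ShuffleLikeHandler(grouped)));
    }
}
```

The point of the sketch is only that the processing loop depends on the interface, so the same processor code can serve both the first stage (file input) and later stages (shuffled input).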