Pig, mail # dev - Review Request 15194: Support multiple inputs for PigProcessor


Re: Review Request 15194: Support multiple inputs for PigProcessor
Daniel Dai 2013-11-05, 07:29

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15194/#review28179
-----------------------------------------------------------

src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
<https://reviews.apache.org/r/15194/#comment54825>

    Do we need to take care of "pig.sortOrder" here?

src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
<https://reviews.apache.org/r/15194/#comment54826>

    We don't tag the input, so no index is needed. We might be able to optimize this further, but I am fine with leaving that for the future.

src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
<https://reviews.apache.org/r/15194/#comment54824>

    Since we no longer tag the input, there is no need to specify a group comparator. (The old group comparator skipped the tag when comparing, to make sure that records with the same key, even with different tags, were grouped together.)

- Daniel Dai

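For readers unfamiliar with the tagging point in the last comment, here is a minimal sketch of what "skip the tag when comparing" means. It is not code from the patch or from Pig: TaggedKey, TaggedKeyComparators, SORT and GROUP are hypothetical names, and the key-then-tag sort order is an assumption made only for illustration.

```java
import java.util.Comparator;

// Hypothetical stand-in for a shuffle key that carries an input tag (index).
final class TaggedKey {
    final String key;
    final int tag;

    TaggedKey(String key, int tag) {
        this.key = key;
        this.tag = tag;
    }
}

final class TaggedKeyComparators {
    // Assumed sort order: key first, then tag.
    static final Comparator<TaggedKey> SORT =
        Comparator.comparing((TaggedKey k) -> k.key).thenComparingInt(k -> k.tag);

    // Grouping order: ignores the tag, so the same key coming from
    // different inputs compares as equal and lands in one group.
    static final Comparator<TaggedKey> GROUP =
        Comparator.comparing((TaggedKey k) -> k.key);

    public static void main(String[] args) {
        TaggedKey fromInput0 = new TaggedKey("k1", 0);
        TaggedKey fromInput1 = new TaggedKey("k1", 1);
        // Same key, different tag: equal for grouping purposes.
        System.out.println(GROUP.compare(fromInput0, fromInput1)); // prints 0
    }
}
```

If the keys carry no tag at all, as the comment says is now the case, the two comparators above would behave identically, which is why a separate group comparator would no longer be needed.
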
On Nov. 2, 2013, 1:17 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15194/
> -----------------------------------------------------------
>
> (Updated Nov. 2, 2013, 1:17 a.m.)
>
>
> Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
>
>
> Bugs: PIG-3527
>     https://issues.apache.org/jira/browse/PIG-3527
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Adds support for multiple LogicalInputs to the PigProcessor. This is done by adding a new TezLoad interface which PhysicalOperators may implement. On the backend, any operator implementing this interface will have the LogicalInput attached to it. Two implementations are included:
> * POSimpleTezLoad, which consumes a single MRInput
> * POShuffleTezLoad, which consumes one or more ShuffledMergedInputs
> POShuffleTezLoad does a k-way merge of the shuffle inputs and packages the result for the operator pipeline. This required a change to the comparators so that the sort order remains consistent. There is also a fix to POForEach, which was using the incorrect status code for signaling (although that produced the same end result in the MR pipeline).
>
>
> Diffs
> -----
>
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigDecimalRawComparator.java ddea99e
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigIntegerRawComparator.java 5ea3fc7
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBooleanRawComparator.java dfd4ebf
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java 09397e5
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDateTimeRawComparator.java a87161f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDoubleRawComparator.java cbf457f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigFloatRawComparator.java 1d86e3f
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigIntRawComparator.java bb6c9df
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigLongRawComparator.java b3ded76
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigSecondaryKeyComparator.java 5ad334b
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTextRawComparator.java 022f37b
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java 866c39d
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleSortComparator.java 9724b9f
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/POSimpleTezLoad.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/TezLoad.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java eb9f62a
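
The k-way merge mentioned in the quoted description can be sketched roughly as follows. This is a generic heap-based merge over in-memory lists, not the POShuffleTezLoad implementation; KWayMergeSketch, Head and merge are hypothetical names. The sketch assumes every input is already sorted with the same comparator that drives the merge, which is presumably why the description stresses a consistent sort order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a k-way merge; not code from the patch.
final class KWayMergeSketch {

    // One heap entry: the current head of an input plus the iterator it came from.
    private static final class Head<T> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }
    }

    // Merges inputs that are each already sorted by `order` into one sorted list.
    static <T> List<T> merge(List<List<T>> sortedInputs, Comparator<? super T> order) {
        Comparator<Head<T>> byValue = (a, b) -> order.compare(a.value, b.value);
        PriorityQueue<Head<T>> heap = new PriorityQueue<>(byValue);

        // Seed the heap with the first element of every non-empty input.
        for (List<T> input : sortedInputs) {
            Iterator<T> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }

        // Repeatedly take the smallest head and refill from the same input.
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head<T> smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> inputs = Arrays.asList(
            Arrays.asList(1, 4, 7),
            Arrays.asList(2, 4, 9),
            Arrays.asList(3));
        // prints [1, 2, 3, 4, 4, 7, 9]
        System.out.println(merge(inputs, Comparator.<Integer>naturalOrder()));
    }
}
```

If one input were sorted with a different comparator than the one passed to merge, the heap would still run but the output would no longer be globally sorted, so equal keys from different inputs might not end up adjacent.
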