Pig, mail # dev - Review Request 14504: PIG-3500 Initial implementation of TezCompiler


Review Request 14504: PIG-3500 Initial implementation of TezCompiler
Cheolsoo Park 2013-10-05, 00:21

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14504/
-----------------------------------------------------------

Review request for pig, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.

Bugs: PIG-3500
    https://issues.apache.org/jira/browse/PIG-3500

Repository: pig-git

Description
-------

Initial implementation of TezCompiler, which converts a physical plan into a Tez plan. This version supports only the basic operators: LOAD, STORE, FILTER, FOREACH, GROUP, and JOIN.

Here is an example:

a = load '/tmp/input' as (name, age, gpa);
b = filter a by age>=30;
c = group b by age;
d = foreach c generate group as age, COUNT(b);
e = load '/tmp/fact' as (age, comments);
f = join d by age, e by age;
store f into '/tmp/output';

>> pig -x tez -e 'explain -script test.pig'

#--------------------------------------------------
# TEZ plan:
#--------------------------------------------------
Tez vertex scope-29
c: Local Rearrange[tuple]{bytearray}(false) - scope-8
|   |
|   Project[bytearray][1] - scope-9
|
|---b: Filter[bag] - scope-1
    |   |
    |   Greater Than or Equal[boolean] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][1] - scope-2
    |   |
    |   |---Constant(30) - scope-4
    |
    |---a: Load(/tmp/input:org.apache.pig.builtin.PigStorage) - scope-0
Tez vertex scope-30
f: Local Rearrange[tuple]{bytearray}(false) - scope-21
|   |
|   Project[bytearray][0] - scope-22
|
|---d: New For Each(false,false)[bag] - scope-15
    |   |
    |   Project[bytearray][0] - scope-10
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT)[long] - scope-13
    |   |
    |   |---Project[bag][1] - scope-12
    |
    |---c: Package[tuple]{bytearray} - scope-7
Tez vertex scope-31
f: Local Rearrange[tuple]{bytearray}(false) - scope-23
|   |
|   Project[bytearray][0] - scope-24
|
|---e: Load(/tmp/fact:org.apache.pig.builtin.PigStorage) - scope-16
Tez vertex scope-32
f: Store(/tmp/output:org.apache.pig.builtin.PigStorage) - scope-28
|
|---f: New For Each(true,true)[tuple] - scope-27
    |   |
    |   Project[bag][1] - scope-25
    |   |
    |   Project[bag][2] - scope-26
    |
    |---f: Package[tuple]{bytearray} - scope-20
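
The explain output lists the vertices but does not print the edges between them. One way to read it, sketched below in Java, is to pair each vertex whose output operator is a Local Rearrange for some alias with the vertex whose input operator is a Package for the same alias. This pairing rule is my assumption for illustration only; it is not taken from TezCompiler, and the TezPlanSketch and Vertex names are hypothetical and do not exist in Pig or Tez.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these types exist in Pig or Tez.
class TezPlanSketch {

    // One printed "Tez vertex": its scope id, plus the alias and kind of the
    // first operator listed (output side) and the deepest one (input side).
    static final class Vertex {
        final String id, outAlias, outKind, inAlias, inKind;

        Vertex(String id, String outAlias, String outKind, String inAlias, String inKind) {
            this.id = id;
            this.outAlias = outAlias;
            this.outKind = outKind;
            this.inAlias = inAlias;
            this.inKind = inKind;
        }
    }

    // The four vertices, transcribed from the explain output above.
    static List<Vertex> examplePlan() {
        List<Vertex> plan = new ArrayList<>();
        plan.add(new Vertex("scope-29", "c", "Local Rearrange", "a", "Load"));
        plan.add(new Vertex("scope-30", "f", "Local Rearrange", "c", "Package"));
        plan.add(new Vertex("scope-31", "f", "Local Rearrange", "e", "Load"));
        plan.add(new Vertex("scope-32", "f", "Store", "f", "Package"));
        return plan;
    }

    // Assumed pairing rule: a vertex that ends in a Local Rearrange for alias X
    // feeds every vertex that starts with a Package for alias X.
    static Map<String, List<String>> inferEdges(List<Vertex> plan) {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        for (Vertex from : plan) {
            List<String> targets = new ArrayList<>();
            if (from.outKind.equals("Local Rearrange")) {
                for (Vertex to : plan) {
                    if (to.inKind.equals("Package") && to.inAlias.equals(from.outAlias)) {
                        targets.add(to.id);
                    }
                }
            }
            edges.put(from.id, targets);
        }
        return edges;
    }

    public static void main(String[] args) {
        // Print each vertex with the vertices it is inferred to feed.
        inferEdges(examplePlan()).forEach((from, to) -> System.out.println(from + " -> " + to));
    }
}
```

Under that assumption, scope-29 (load, filter, rearrange for the group) feeds scope-30 (package, foreach, rearrange for the join), and scope-30 and scope-31 both feed scope-32 (package, foreach, store).
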
Diffs
-----

  ivy.xml 7163c89
  ivy/libraries.properties ea08384
  src/org/apache/pig/backend/hadoop/executionengine/tez/MapOper.java 623ec95
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/tez/ReduceOper.java ef5fe84
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java a4c9c59
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerException.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecType.java 1d90f95
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecutionEngine.java 5e9caf6
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java e182f0d
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java ca06151
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java 5d68a85
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezScriptState.java PRE-CREATION
  src/org/apache/pig/impl/PigContext.java 1b6ac61
  test/org/apache/pig/test/TestMRCompiler.java 8c85280
  test/org/apache/pig/test/Util.java a2bc1cf
  test/org/apache/pig/test/data/GoldenFiles/TEZC1.gld PRE-CREATION
  test/org/apache/pig/test/data/GoldenFiles/TEZC2.gld PRE-CREATION
  test/org/apache/pig/test/data/GoldenFiles/TEZC3.gld PRE-CREATION
  test/org/apache/pig/tez/TestTezCompiler.java PRE-CREATION

Diff: https://reviews.apache.org/r/14504/diff/

Testing
-------

Added unit test cases to TestTezCompiler. Note that this patch requires the latest version (trunk) of Apache Tez.

Thanks,

Cheolsoo Park