Review Request 15031: PIG-3541 Add diagnostic information to TezStats

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15031/
-----------------------------------------------------------

Review request for pig, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.
Bugs: PIG-3541
    https://issues.apache.org/jira/browse/PIG-3541
Repository: pig-git
Description
-------

This patch includes the following changes:
* Implement Input/OutputStats for Tez. (This makes DUMP work.) As of now, counters cannot be retrieved from the Tez DAG, so only filenames are reported.
* Add the error message from DAGStatus.getDiagnostics() for a failed DAG. As of now, backend error messages and stack traces cannot be retrieved from the Tez DAG, so only the ID of the failed vertex is reported.
* Factor out a few methods/fields that can be used by both MR and Tez into PigStats. Duplicate code between SimplePigStats and TezStats is minimal now.
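
For readers who do not want to open the diff, the shape of the refactoring described above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: StatsSketch, BaseStats, TezLikeStats, and accumulate() are hypothetical names, and the diagnostics are passed in as a plain list of strings rather than read from a Tez DAGStatus.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are the real Pig or Tez classes.
class StatsSketch {

    /** Stand-in for the shared base class: state that both engines need. */
    static abstract class BaseStats {
        private final List<String> inputs = new ArrayList<>();
        private final List<String> outputs = new ArrayList<>();
        private String errorMessage = "";

        void addInput(String location) { inputs.add(location); }
        void addOutput(String location) { outputs.add(location); }
        List<String> getInputs() { return inputs; }
        List<String> getOutputs() { return outputs; }
        String getErrorMessage() { return errorMessage; }
        protected void setErrorMessage(String msg) { errorMessage = msg; }

        /** Engine-specific: how a finished job is folded into the stats. */
        abstract void accumulate(boolean succeeded, List<String> diagnostics);
    }

    /** Stand-in for the Tez subclass: only the engine-specific part lives here. */
    static class TezLikeStats extends BaseStats {
        @Override
        void accumulate(boolean succeeded, List<String> diagnostics) {
            if (!succeeded) {
                // Join the diagnostic lines reported for the failed DAG.
                setErrorMessage(String.join("\n", diagnostics));
            }
        }
    }
}
```

The point of the split is that input/output bookkeeping and the error-message field sit in one place, so an MR-side subclass would only need its own accumulate().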
Diffs
-----

  src/org/apache/pig/backend/hadoop/executionengine/tez/TezJob.java 14da46e
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezJobControlCompiler.java 9d93968
  src/org/apache/pig/tools/pigstats/JobStats.java 5eac24b
  src/org/apache/pig/tools/pigstats/PigStats.java e2eba6d
  src/org/apache/pig/tools/pigstats/mapreduce/MRJobStats.java 1a37848
  src/org/apache/pig/tools/pigstats/mapreduce/MRPigStatsUtil.java 4bdcf19
  src/org/apache/pig/tools/pigstats/mapreduce/SimplePigStats.java 5088563
  src/org/apache/pig/tools/pigstats/tez/TezStats.java b0d7f45
  src/org/apache/pig/tools/pigstats/tez/TezTaskStats.java bd45d8f
  test/org/apache/pig/tez/TestTezLauncher.java 8382a7d

Diff: https://reviews.apache.org/r/15031/diff/
Testing
-------

* Updated TestTezLauncher by adding asserts for input/output stats.
* Ran ant test-tez.
* Verified the reports for succeeded and failed DAGs:

  Success!
            Input(s): Successfully read records from: "hdfs://localhost:57063/user/cheolsoop/foo"                        
           Output(s): Successfully stored records in: "/user/cheolsoop/13"

  Failed!
        ErrorMessage: Vertex failed vertex_1383071498815_0006_1_01                                                        
                    : DAG failed due to vertex failure. failedVertices:1 killedVertices:0                                

            Input(s): Failed to read data from "hdfs://localhost:57063/user/cheolsoop/foo"                                
           Output(s): Failed to produce result in "/user/cheolsoop/14"
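
The report lines above could be produced by something like the following. This is only a sketch of the formatting: ReportSketch and its methods are hypothetical names, the label width of 21 is an assumption read off the sample output rather than taken from the patch, and the message wording is copied from the sample.

```java
// Hypothetical sketch of the report formatting; not the code in the patch.
class ReportSketch {

    /** Right-align "label:" in a fixed-width column, then append the text. */
    static String line(String label, String text) {
        return String.format("%21s %s", label + ":", text);
    }

    /** One Input(s) line; filename only, since no counters are available. */
    static String inputLine(boolean succeeded, String location) {
        String text = succeeded
                ? "Successfully read records from: \"" + location + "\""
                : "Failed to read data from \"" + location + "\"";
        return line("Input(s)", text);
    }

    /** One Output(s) line, mirroring the input case. */
    static String outputLine(boolean succeeded, String location) {
        String text = succeeded
                ? "Successfully stored records in: \"" + location + "\""
                : "Failed to produce result in \"" + location + "\"";
        return line("Output(s)", text);
    }
}
```

Calling ReportSketch.inputLine(false, "hdfs://localhost:57063/user/cheolsoop/foo") would yield a line like the Input(s) entry in the failed report.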
Thanks,

Cheolsoo Park