Pig, mail # dev - Review Request: disable optimizations via pig properties

Re: Review Request: disable optimizations via pig properties
Travis Crawford 2013-05-14, 00:00


> On May 13, 2013, 11:35 p.m., Bill Graham wrote:
> > src/docs/src/documentation/content/xdocs/perf.xml, line 493
> > <https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493>
> >
> >     Would you please specify that setting this value in both the pig properties file and the command line (or script) will be additive.

Currently it works like this:

(a) Rules passed via the -optimizer_off command-line option are always disabled.
(b) The "pig.optimizer.rules.disabled" property works like any other property: a value set in the script itself overwrites any previously set value (from either the command line or pig.properties).

Disabled rules are additive in that the union of (a) and (b) is disabled. Within (b), however, only the last value assigned to pig.optimizer.rules.disabled takes effect.

I think this matches how people will want to use the feature, and I believe it is consistent with how other properties work:

* Site administrators can specify default rules to disable via pig.properties.
* Individual scripts can override the site defaults if needed.
* Invokers of pig can supplement the rules to disable.
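
To make the semantics above concrete, here is a rough Python model of how the final set of disabled rules is resolved. It is only an illustration, not the code in the patch: the function names are made up, the comma-separated value format is an assumption, and the rule names other than ColumnMapKeyPrune are just placeholders.

```python
def parse_rules(value):
    """Split a rule list into a set of names.

    Assumption: the property value is a comma-separated list of rule names.
    """
    if not value:
        return set()
    return {name.strip() for name in value.split(",") if name.strip()}


def effective_disabled_rules(optimizer_off=(), property_values=()):
    """Return the set of optimizer rules that end up disabled.

    optimizer_off:   rule names given via -optimizer_off, i.e. (a).
    property_values: every value assigned to pig.optimizer.rules.disabled,
                     in the order applied (pig.properties, then -D, then
                     "set" in the script), i.e. (b).
    """
    # (a) is always disabled, regardless of what the property says.
    disabled = set(optimizer_off)
    # (b) is not additive with itself: only the last assignment counts.
    last_value = property_values[-1] if property_values else None
    # The result is the union of (a) and the last value of (b).
    return disabled | parse_rules(last_value)


# Site default in pig.properties, overridden by the script, supplemented
# by whoever invokes pig with -optimizer_off.
rules = effective_disabled_rules(
    optimizer_off=["PushUpFilter"],
    property_values=["MergeFilter", "ColumnMapKeyPrune"],
)
print(sorted(rules))
```

In this model the site default (MergeFilter) is dropped because the script assigned a new value, while the -optimizer_off rule survives no matter what the property is set to.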

Thoughts? If we want to be additive within (b), we would also need a way to remove defaults set by site administrators, since the default should be a suggestion, not a requirement. That could easily be achieved with a "-" prefix that removes a rule from the disabled list, but I think we've covered the common use cases without introducing extra complexity.

- Travis
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20516
-----------------------------------------------------------
On May 13, 2013, 8:35 p.m., Travis Crawford wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11032/
> -----------------------------------------------------------
>
> (Updated May 13, 2013, 8:35 p.m.)
>
>
> Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
>
>
> Description
> -------
>
> Update pig to allow disabling optimizations via pig properties. Currently optimizations must be disabled via command-line options. Pig properties can be set in pig.properties, "set" commands in scripts themselves, and command-line -D options.
>
> The use case is scripts that require certain optimizations to be disabled: the script itself should be able to disable them. Currently, whatever runs the script needs to handle disabling the optimization specially for that specific query.
>
>
> This addresses bug PIG-3317.
>     https://issues.apache.org/jira/browse/PIG-3317
>
>
> Diffs
> -----
>
>   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e
>   src/org/apache/pig/Main.java f97ed9f
>   src/org/apache/pig/PigConstants.java ea77e97
>   src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 4dab4e8
>   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java d26f381
>   test/org/apache/pig/test/TestEvalPipeline2.java 39cf807
>
> Diff: https://reviews.apache.org/r/11032/diff/
>
>
> Testing
> -------
>
> Manually tested on a fully-distributed cluster.
>
> THIS FAILS:
> PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
>
> THIS WORKS:
> PIG_CONF_DIR=/etc/pig/conf ./bin/pig -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
>
> Notice how "-Dpig.optimizer.rules.disabled=ColumnMapKeyPrune" specifies a pig property, which could be in pig.properties, or the script itself.
>
>
> Failure message:
>
> Pig Stack Trace
> ---------------
> ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 97550 Input: 0 Column: 1)
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias null
> at org.apache.pig.PigServer.explain(PigServer.java:1057)
> at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)