

Re: Review Request 16928: PIG-3463 Pig should use hadoop local mode for small jobs

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16928/#review32458
-----------------------------------------------------------

Ship it!

- Cheolsoo Park


On Jan. 21, 2014, 2:52 a.m., Aniket Mokashi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16928/
> -----------------------------------------------------------
>
> (Updated Jan. 21, 2014, 2:52 a.m.)
>
>
> Review request for pig, Cheolsoo Park, Daniel Dai, Dmitriy Ryaboy, and Julien Le Dem.
>
>
> Bugs: PIG-3463
>     https://issues.apache.org/jira/browse/PIG-3463
>
>
> Repository: pig
>
>
> Description
> -------
>
> If pig.auto.local.enabled is set, JobControlCompiler (JCC) modifies the Configuration of every job that has one reducer and an input size below pig.auto.local.input.maxbytes, so that the job is forced to run in local mode. The output of a local run is also written to HDFS.
>
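The rule in the description can be sketched as a small predicate. This is only an illustration under stated assumptions, not the code in the patch: the class name, the method name, and the fallback threshold are hypothetical, and java.util.Properties stands in for the Hadoop Configuration object. Only the two property names come from the description above.

```java
import java.util.Properties;

public class AutoLocalSketch {
    // Property names as given in the review description.
    static final String ENABLED = "pig.auto.local.enabled";
    static final String MAX_BYTES = "pig.auto.local.input.maxbytes";

    // Hypothetical fallback used only by this sketch; not taken from the patch.
    static final long ASSUMED_DEFAULT_MAX_BYTES = 100L * 1000 * 1000;

    /**
     * Returns true when a job would qualify for local mode under the rule
     * stated in the description: feature enabled, exactly one reducer,
     * and total input size strictly below the configured threshold.
     * How the patch treats other reducer counts is not covered here.
     */
    public static boolean shouldRunLocal(Properties conf, int numReducers, long totalInputBytes) {
        boolean enabled = Boolean.parseBoolean(conf.getProperty(ENABLED, "false"));
        long maxBytes = Long.parseLong(
                conf.getProperty(MAX_BYTES, String.valueOf(ASSUMED_DEFAULT_MAX_BYTES)));
        return enabled && numReducers == 1 && totalInputBytes < maxBytes;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ENABLED, "true");
        conf.setProperty(MAX_BYTES, "1000000");

        // Each job is checked on its own, so one script can mix local and MR jobs.
        System.out.println(shouldRunLocal(conf, 1, 4096L));      // small input
        System.out.println(shouldRunLocal(conf, 1, 5_000_000L)); // input above the threshold
    }
}
```

Because the check is made per job, a script with several jobs can end up with some of them running locally and the rest in MR mode, which is consistent with the mixed results listed under Testing.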
>
> Diffs
> -----
>
>   trunk/src/org/apache/pig/ExecTypeProvider.java 1558572
>   trunk/src/org/apache/pig/PigConfiguration.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceOper.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigInputFormat.java 1558572
>   trunk/src/org/apache/pig/impl/PigImplConstants.java 1558572
>   trunk/test/org/apache/pig/test/TestAutoLocalMode.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/16928/diff/
>
>
> Testing
> -------
>
> Tried a few scenarios with the patch:
> Load small data, group all, count - works in local mode.
> Load small data and another small data set, then replicated join - works in local mode.
> Load small data and order by key - all 3 jobs work in local mode.
> Load small data and large data for replicated join - first job runs in local mode, second runs in MR mode.
> Load large data and order by key - the first stages run in local mode and the last stage runs in MR mode.
>
>
> Thanks,
>
> Aniket Mokashi
>
>