Re: Review Request 14274: PIG-2672 Optimize the use of DistributedCache

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14274/#review26364
-----------------------------------------------------------
The patch contains stray whitespace and the code is not formatted. Many places have no space before and after operators such as + (concatenation) and !=.

trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/14274/#comment51488>

    If the path is already on HDFS, use it as is and do not ship it to the jar cache. This also saves time and hash checks.
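
A minimal sketch of the check suggested above, using only `java.net.URI` from the JDK. The class and method names (`JarLocationSketch`, `isAlreadyOnHdfs`) and the example URIs are hypothetical and do not come from the patch; the real code in JobControlCompiler may need to compare against the configured default file system rather than a literal scheme.

```java
import java.net.URI;

final class JarLocationSketch {
    /**
     * Returns true when the jar URI already carries the hdfs scheme,
     * in which case it could be used as is instead of being copied
     * into the jar cache. (Hypothetical helper, not part of the patch.)
     */
    static boolean isAlreadyOnHdfs(URI jar) {
        String scheme = jar.getScheme();
        return scheme != null && scheme.equalsIgnoreCase("hdfs");
    }

    public static void main(String[] args) {
        // Example URIs are made up for illustration.
        System.out.println(isAlreadyOnHdfs(URI.create("hdfs://nn:8020/libs/udf.jar")));
        System.out.println(isAlreadyOnHdfs(URI.create("file:///tmp/udf.jar")));
    }
}
```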

trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/14274/#comment51492>

    Since the name of the file on HDFS differs from that of the actual file, create a symlink with the actual file name. Some users might depend on the actual file name.
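
One way to get such a symlink, sketched under assumptions: Hadoop's distributed cache takes the symlink name from the URI fragment (the part after `#`), so the cached file's URI can carry the original file name there. The sketch below only builds that URI with the JDK; `CacheSymlinkSketch`, `withSymlinkName`, and the cached file name are hypothetical, and registering the URI with the job is left out.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class CacheSymlinkSketch {
    /**
     * Returns the cached jar's URI with the original file name attached as
     * the fragment, e.g. .../abc123-udf.jar#udf.jar. (Hypothetical helper.)
     */
    static URI withSymlinkName(URI cachedJar, String originalFileName)
            throws URISyntaxException {
        return new URI(cachedJar.getScheme(), cachedJar.getAuthority(),
                cachedJar.getPath(), null, originalFileName);
    }

    public static void main(String[] args) throws URISyntaxException {
        // The checksum-prefixed cache name below is made up for illustration.
        URI cached = URI.create("hdfs://nn:8020/cache/abc123-udf.jar");
        System.out.println(withSymlinkName(cached, "udf.jar"));
    }
}
```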

trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/14274/#comment51489>

    First compare the file sizes before calculating the checksum, for better efficiency.
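
A rough sketch of the ordering suggested above, written against `java.nio.file` as a stand-in for the Hadoop file system calls (where the length would presumably come from `FileStatus.getLen()`). The class name, method names, and the choice of SHA-1 are assumptions for illustration; the patch may use a different digest.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class SameFileSketch {
    /** Reads the whole file and returns its SHA-1 digest (assumed algorithm). */
    static byte[] sha1(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading drives the digest; nothing else to do here.
            }
        }
        return digest.digest();
    }

    /** Cheap size check first; the checksum is computed only when sizes match. */
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        if (Files.size(a) != Files.size(b)) {
            return false; // different lengths, so no need to read either file
        }
        return Arrays.equals(sha1(a), sha1(b));
    }
}
```

The size check costs only a metadata lookup, so most non-matching files are rejected without reading any data.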

trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/14274/#comment51494>

    You can write a PathFilter that matches the file size and does an endsWith check on "-" + suffix.

    The checksum name check can then be applied to the filtered list.
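
The filtering idea above can be sketched with the JDK's `DirectoryStream.Filter` as a stand-in for Hadoop's `PathFilter` (which would be passed to `FileSystem.listStatus`). Everything named here (`CacheCandidateFilterSketch`, `sizeAndSuffix`, `candidates`) is hypothetical, and the cache file naming is assumed to be a checksum followed by "-" and the suffix, as the comment implies.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class CacheCandidateFilterSketch {
    /** Accepts regular files whose name ends with "-" + suffix and whose size matches. */
    static DirectoryStream.Filter<Path> sizeAndSuffix(long expectedSize, String suffix) {
        return entry -> Files.isRegularFile(entry)
                && entry.getFileName().toString().endsWith("-" + suffix)
                && Files.size(entry) == expectedSize;
    }

    /** Lists the cache directory entries that pass the filter. */
    static List<Path> candidates(Path cacheDir, long expectedSize, String suffix)
            throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(cacheDir, sizeAndSuffix(expectedSize, suffix))) {
            for (Path entry : stream) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

Only the entries returned by `candidates` would then need the more expensive checksum name check.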

- Rohini Palaniswamy

On Sept. 21, 2013, 1:21 a.m., Aniket Mokashi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14274/
> -----------------------------------------------------------
>
> (Updated Sept. 21, 2013, 1:21 a.m.)
>
>
> Review request for pig, Cheolsoo Park, DanielWX DanielWX, Dmitriy Ryaboy, Julien Le Dem, and Rohini Palaniswamy.
>
>
> Bugs: PIG-2672
>     https://issues.apache.org/jira/browse/PIG-2672
>
>
> Repository: pig
>
>
> Description
> -------
>
> added jar.cache.location option
>
>
> Diffs
> -----
>
>   trunk/src/org/apache/pig/PigConstants.java 1525188
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1525188
>   trunk/src/org/apache/pig/impl/PigContext.java 1525188
>   trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1525188
>   trunk/test/org/apache/pig/test/TestJobControlCompiler.java 1525188
>
> Diff: https://reviews.apache.org/r/14274/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Aniket Mokashi
>
>