Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Pig, mail # dev - Review Request 13781: Changes to add support for streaming_python udfs.


Copy link to this message
-
Re: Review Request 13781: Changes to add support for streaming_python udfs.
Daniel Dai 2013-09-07, 00:04

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13781/#review25964
-----------------------------------------------------------

src/org/apache/pig/PigToStream.java
<https://reviews.apache.org/r/13781/#comment50670>

    No code change, remove this section from patch

src/org/apache/pig/StreamToPig.java
<https://reviews.apache.org/r/13781/#comment50671>

    No code change, remove this section from patch

src/org/apache/pig/builtin/PigStreaming.java
<https://reviews.apache.org/r/13781/#comment50672>

    No code change, remove this section from patch

src/org/apache/pig/builtin/PigToStreamUDF.java
<https://reviews.apache.org/r/13781/#comment50675>

    I feel those classes StreamingUDF should go to package org.apache.pig.impl.builtin, StreamingDelimiters, StreamUDFToPig, PigToStreamUDF, etc should go to org.apache.pig.impl.streaming

src/org/apache/pig/builtin/PigToStreamUDF.java
<https://reviews.apache.org/r/13781/#comment50673>

    I personally like to move the last element processing to the for loop:
            for (int i=0; i < sz; i++) {
                field = t.get(i);
                StorageUtil.putField(out, field, DELIMS, true);
                if (i!=sz-1) {
                    out.write(DELIMS.getParamDelim());
                }
            }

src/org/apache/pig/builtin/StreamUDFToPig.java
<https://reviews.apache.org/r/13781/#comment50684>

    This might be affected by PIG-3255

src/org/apache/pig/builtin/StreamUDFToPig.java
<https://reviews.apache.org/r/13781/#comment50686>

    How do we handle delimit escaping? Seems we don't handle it right now?

src/org/apache/pig/builtin/StreamingUDF.java
<https://reviews.apache.org/r/13781/#comment50685>

    Are those files only for illustrate? Why not just use a random tmp file? Why do we distinguish exectype in illustrate? It only run locally.

src/org/apache/pig/builtin/StreamingUDF.java
<https://reviews.apache.org/r/13781/#comment50680>

    Any reason not to create a StreamingUDFInputHandler?

src/org/apache/pig/builtin/StreamingUDF.java
<https://reviews.apache.org/r/13781/#comment50679>

    Can we use FileUtils.copyFile?

src/org/apache/pig/builtin/StreamingUDF.java
<https://reviews.apache.org/r/13781/#comment50683>

    Seems flush() is more proper to put inside inputHandler

src/org/apache/pig/impl/streaming/InputHandler.java
<https://reviews.apache.org/r/13781/#comment50682>

    Why we need this change?

test/unit-tests
<https://reviews.apache.org/r/13781/#comment50681>

    This is not true unit test, should remove from the list
Overall, the patch is non-intrusive. Except for minor refactory in ExecutableManager, existing features should not be affected. I am fine to check in code quickly, then improve the code over time. Before that, we need:
* address existing comments, especially conflict with PIG-3255
* add some e2e tests
* run regression tests

- Daniel Dai
On Sept. 5, 2013, 8:23 p.m., Jeremy Karn wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13781/
> -----------------------------------------------------------
>
> (Updated Sept. 5, 2013, 8:23 p.m.)
>
>
> Review request for pig.
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Changes for PIG-2417 (https://issues.apache.org/jira/browse/PIG-2417)
>
>
> Diffs
> -----
>
>   build.xml 7e22192
>   src/org/apache/pig/PigToStream.java 7cc2950
>   src/org/apache/pig/StreamToPig.java ff24b27
>   src/org/apache/pig/builtin/PigStreaming.java 5467693
>   src/org/apache/pig/builtin/PigToStreamUDF.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamUDFToPig.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingDelimiters.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingUDF.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingUDFException.java PRE-CREATION