Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Pig, mail # dev - Review Request 13781: Changes to add support for streaming_python udfs.

Copy link to this message
Re: Review Request 13781: Changes to add support for streaming_python udfs.
Daniel Dai 2013-09-07, 00:04

This is an automatically generated e-mail. To reply, visit:


    No code change, remove this section from patch


    No code change, remove this section from patch


    No code change, remove this section from patch


    I feel those classes StreamingUDF should go to package org.apache.pig.impl.builtin, StreamingDelimiters, StreamUDFToPig, PigToStreamUDF, etc should go to org.apache.pig.impl.streaming


    I personally like to move the last element processing to the for loop:
            for (int i=0; i < sz; i++) {
                field = t.get(i);
                StorageUtil.putField(out, field, DELIMS, true);
                if (i!=sz-1) {


    This might be affected by PIG-3255


    How do we handle delimit escaping? Seems we don't handle it right now?


    Are those files only for illustrate? Why not just use a random tmp file? Why do we distinguish exectype in illustrate? It only run locally.


    Any reason not to create a StreamingUDFInputHandler?


    Can we use FileUtils.copyFile?


    Seems flush() is more proper to put inside inputHandler


    Why we need this change?


    This is not true unit test, should remove from the list
Overall, the patch is non-intrusive. Except for minor refactory in ExecutableManager, existing features should not be affected. I am fine to check in code quickly, then improve the code over time. Before that, we need:
* address existing comments, especially conflict with PIG-3255
* add some e2e tests
* run regression tests

- Daniel Dai
On Sept. 5, 2013, 8:23 p.m., Jeremy Karn wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13781/
> -----------------------------------------------------------
> (Updated Sept. 5, 2013, 8:23 p.m.)
> Review request for pig.
> Repository: pig-git
> Description
> -------
> Changes for PIG-2417 (https://issues.apache.org/jira/browse/PIG-2417)
> Diffs
> -----
>   build.xml 7e22192
>   src/org/apache/pig/PigToStream.java 7cc2950
>   src/org/apache/pig/StreamToPig.java ff24b27
>   src/org/apache/pig/builtin/PigStreaming.java 5467693
>   src/org/apache/pig/builtin/PigToStreamUDF.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamUDFToPig.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingDelimiters.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingUDF.java PRE-CREATION
>   src/org/apache/pig/builtin/StreamingUDFException.java PRE-CREATION