Pig, mail # user - calling python udfs with varargs


Re: calling python udfs with varargs
Julien Le Dem 2011-10-17, 20:01
https://issues.apache.org/jira/browse/PIG-2322

On Mon, Oct 17, 2011 at 12:38 PM, Stan Rosenberg <
[EMAIL PROTECTED]> wrote:

> Hi Julien,
>
> Thanks for a quick reply.  I patched my local version of
> JythonFunction to pass the input parameters when 'varargs' is true.
>
> stan
>
> On Mon, Oct 17, 2011 at 2:26 PM, Julien Le Dem <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I'm looking into it. Internally varargs advertise themselves as having 0
> > args so I need to add a special case in the JythonFunction to handle
> > varargs. I'll create a JIRA for this.
> > For now you cannot use varargs, as they will always be called with no
> > parameters.
> > Julien
> >
> > On Mon, Oct 17, 2011 at 9:54 AM, Stan Rosenberg <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> I have a simple python udf which takes a variable number of (string)
> >> arguments and returns the first non-empty one.
> >> I can see that the udf is invoked from pig but no arguments are being
> >> passed.
> >>
> >> Here is the script:
> >> ========================================================
> >>
> >> #!/usr/bin/python
> >>
> >> from org.apache.pig.scripting import *
> >>
> >> @outputSchema("s:chararray")
> >> def firstNonempty(*args):
> >>    print args
> >>    for v in args:
> >>        if len(v) != 0:
> >>           return v
> >>    return ''
> >>
> >> if __name__ == "__main__":
> >>   Pig.compile("""
> >>   data = load 'input.txt' AS (string1:chararray, string2:chararray);
> >>   data = foreach data generate firstNonempty(string1, string2) as id, string1, string2;
> >>   dump data;
> >>   """).bind().runSingle()
> >>
> >> ==========================================================
> >>
> >> Thanks!
> >>
> >> stan
> >>
> >
>
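
Julien's remark that varargs functions "advertise themselves as having 0 args" can be reproduced in plain Python: a code object's `co_argcount` counts only named positional parameters, and `*args` is signalled separately by the `CO_VARARGS` flag. The sketch below is an illustration under assumptions, not Pig's actual JythonFunction code. `call_udf` and `takes_two` are hypothetical names, and the code is written for CPython 3 (older Python 2 and Jython releases spell the `__code__` attribute `func_code`).

```python
import inspect


def first_nonempty(*args):
    """Varargs function modeled on the UDF in this thread."""
    for v in args:
        if v:  # skips None as well as empty strings
            return v
    return ''


def takes_two(a, b):
    """Fixed-arity function, for comparison."""
    return a or b


def call_udf(func, inputs):
    """Hypothetical dispatcher; an assumption, not Pig's implementation."""
    code = func.__code__
    if code.co_flags & inspect.CO_VARARGS:
        # *args is not counted in co_argcount, so pass every input through.
        return func(*inputs)
    # Fixed arity: pass exactly the declared number of arguments.
    return func(*inputs[:code.co_argcount])


print(first_nonempty.__code__.co_argcount)  # 0: *args is not counted
print(takes_two.__code__.co_argcount)       # 2
print(call_udf(first_nonempty, ['', 'abc']))  # abc
```

A dispatcher that sizes its call from `co_argcount` alone would invoke `first_nonempty` with no arguments, which matches the behavior reported above; checking the `CO_VARARGS` flag first is one way to add the special case.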