Pig, mail # user - schema string in python UDF


Doug Daniels 2011-09-30, 19:26
Re: schema string in python UDF
Alan Gates 2011-09-30, 19:30
Looks like it, which is good.  The behavior you're seeing is what we want.

Alan.

On Sep 30, 2011, at 12:26 PM, Doug Daniels wrote:

> Small question: the Python UDF doc says that "variable names inside a schema string are not used anywhere, they just make the syntax identifiable to the parser" (https://pig.apache.org/docs/r0.9.0/udf.html#schemafunction). However, it looks like Pig is picking up those field names and keeping them if I don't override them.
>
> For instance, if I have a Python UDF:
>
> @outputSchema('a:int')
> def my_udf(x):
>    return 123
>
> And a Pig script:
>
> raw = LOAD 'data.txt' USING PigStorage() AS (x:int);
> with_udf = FOREACH raw GENERATE my_udfs.my_udf(x);
>
> Running describe on with_udf gives me:
>
> with_udf: {a: int}
>
> Is the doc incorrect there?
>
> Thanks,
> Doug
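
To try the UDF from the quoted message outside of Pig, a stand-in for the decorator can be sketched in plain Python. This is only an illustration under stated assumptions: the `outputSchema` defined here is a hypothetical shim that stores the schema string on the function, not Pig's own implementation, and `schema_fields` is a made-up helper that handles flat schemas only (no nested bags or tuples). It shows that the field name `a` is carried in the schema string alongside the type, which is consistent with `describe` reporting `{a: int}`.

```python
# Hypothetical stand-in for the decorator that Pig provides to Jython UDFs.
# Assumption: it only records the schema string; Pig's real version may differ.
def outputSchema(schema_string):
    def decorator(func):
        func.outputSchema = schema_string  # keep the schema string on the function
        return func
    return decorator


@outputSchema('a:int')
def my_udf(x):
    return 123


def schema_fields(schema_string):
    """Split a flat schema string such as 'a:int, b:chararray'
    into (name, type) pairs. Nested schemas are not handled."""
    fields = []
    for part in schema_string.split(','):
        name, _, type_name = part.strip().partition(':')
        fields.append((name.strip(), type_name.strip()))
    return fields


if __name__ == '__main__':
    print(my_udf(7))
    print(schema_fields(my_udf.outputSchema))
```

In Pig itself, the name taken from the schema string stays in effect unless the script overrides it, for example with an AS clause on the generated field.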