schema string in python UDF
Doug Daniels 2011-09-30, 19:26
Small question: the Python UDF doc says that "variable names inside a schema string are not used anywhere, they just make the syntax identifiable to the parser" (https://pig.apache.org/docs/r0.9.0/udf.html#schemafunction). However, it looks like Pig is picking up those field names and keeping them if I don't override them.
For instance, if I have a Python UDF:

    @outputSchema('a:int')
    def my_udf(x):
        return 123

And a Pig script:

    raw = LOAD 'data.txt' USING PigStorage() AS (x:int);
    with_udf = FOREACH raw GENERATE my_udfs.my_udf(x);

Running describe on with_udf gives me:

    with_udf: {a: int}

Is the doc incorrect there?

Thanks,
Doug