Re: Using a UDF written in Python
I think the decorator used here is incorrect.
In general, the argument "output:chararray" needs to be a valid schema
string. Also, you are using "outputSchemaFunction", which is meant for a
UDF whose output schema depends on its input schema (e.g. square). It
expects the name of a function that carries the "schemaFunction"
decorator (named "output" in your case). I think using the "outputSchema"
decorator would fix the problem here.
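
To illustrate the difference, here is a minimal sketch of both decorator styles. The function names (to_upper, square, square_schema) are made up for this example, and the fallback decorators at the top are stand-ins I added so the file can also be run outside Pig; I am assuming Pig's Jython engine supplies the real decorators when the script is registered.

```python
# Stand-in decorators (an assumption for local testing only): Pig's Jython
# engine is expected to provide the real ones when the script is registered.
try:
    outputSchema, outputSchemaFunction, schemaFunction
except NameError:
    def outputSchema(schema_string):
        def wrap(func):
            func.outputSchema = schema_string
            return func
        return wrap

    def outputSchemaFunction(schema_function_name):
        def wrap(func):
            func.outputSchemaFunction = schema_function_name
            return func
        return wrap

    def schemaFunction(name):
        def wrap(func):
            func.schemaFunction = name
            return func
        return wrap


# Fixed output schema: the decorator argument is a schema string.
@outputSchema("output:chararray")
def to_upper(text):
    if text is None:
        return None
    return text.upper()


# Output schema that depends on the input schema: the decorator names
# a separate function that computes the schema.
@outputSchemaFunction("square_schema")
def square(num):
    if num is None:
        return None
    return num * num


# Hypothetical schema function: returns the input schema unchanged,
# so squaring an int yields an int, a double yields a double, and so on.
@schemaFunction("square_schema")
def square_schema(input_schema):
    return input_schema
```

With the register line from your script, the first function would then be called as myfunc.to_upper(...) from Pig.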

More details can be found at:
http://wiki.apache.org/pig/UDFsUsingScriptingLanguages

Thanks,
Aniket

On Mon, December 27, 2010 4:30 pm, Jonathan Coveney wrote:
> So I have module.py, and I want to be able to use it in a Pig script. It
> has no special imports or anything. I do have
> @outputSchemaFunction("output:chararray)
>
>
> In my pig script, I have this
>
>
> register '/my/udf/location/udf.py' using jython as myfunc;
>
> Is there any reason why this wouldn't work? Here is the error I get:
>
> 2010-12-27 16:29:41,288 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 2998: Unhandled internal error. org/python/util/PythonInterpreter
>
>
> Not the most instructive error, but is there anything more I need to be
> doing to be able to use a Python UDF?
>
> As an aside, are simple Python UDFs as efficient as Java ones? I like
> Python a lot and love the idea of being able to write UDFs in it, but
> can use Java if necessary.
>