Pig >> mail # dev >> Streaming UDFs

Streaming UDFs
We've been using Pig's Jython UDF support and really enjoying it, but we keep hitting cases where we need Python modules with C extensions, which Jython doesn't support.

While we could use the STREAM operator to make this work, it'd be great to keep what UDFs give you: simplicity, type-checking/casting, and the ability to pass in exactly the fields you need. I think we could get that by adding Streaming UDFs, for which I've sketched an idea on the wiki: https://cwiki.apache.org/confluence/display/PIG/StreamingUDFs

It's still just a sketch, but I'd love feedback on the direction, or any other ideas if people have thought about it in the past.
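
To make the idea a bit more concrete, here is a rough, purely illustrative Python sketch of what the external-process side of a streaming UDF might look like. It assumes tab-delimited rows arriving one per line and one result written back per row. The names (`parse_row`, `run_streaming_udf`, `add`) and the wire format are hypothetical placeholders, not what the wiki page specifies and not an existing Pig API.

```python
import io

# Hypothetical sketch: none of these names come from Pig or the wiki page.

def parse_row(line, casts):
    """Split one tab-delimited input line and cast each field.

    `casts` is a list of callables (e.g. int, float, str), one per field,
    standing in for the type information a UDF declaration would provide.
    """
    fields = line.rstrip("\n").split("\t")
    return [cast(field) for cast, field in zip(casts, fields)]


def run_streaming_udf(func, casts, infile, outfile):
    """Apply `func` to every row read from `infile`, one result per line."""
    for line in infile:
        args = parse_row(line, casts)
        outfile.write(str(func(*args)) + "\n")


def add(x, y):
    # Stand-in for a function that would really call into a C-extension module.
    return x + y


# Small in-memory demo; a real process would use sys.stdin and sys.stdout.
demo_out = io.StringIO()
run_streaming_udf(add, [int, int], io.StringIO("1\t2\n3\t4\n"), demo_out)
print(demo_out.getvalue(), end="")  # prints 3 and 7 on separate lines
```

The point of the sketch is only that the function receives exactly the fields it was called with, already cast to the declared types, which is the part that plain STREAM leaves to the script author.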