Pig >> mail # user >> How to refer field name in Jython UDF


Stanley Xu 2013-02-08, 03:18
Re: How to refer field name in Jython UDF
Currently, the answer to this is no. On the Java side, as of Pig 0.11.0, you
can get the schema in an EvalFunc, and it would not be hard to make this
available from a Jython UDF, though we'd need a patch.
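
Until such a patch exists, one possible workaround is to keep the field names by hand inside the UDF file and pair them with the positional values. This is only a minimal sketch: it assumes the struct arrives in Jython as a plain tuple, and the field order in SOME_FIELD_NAMES and the helper as_dict are hypothetical and must be kept in sync with the real Thrift definition manually.

```python
# Hypothetical field order for the Thrift struct; this is an assumption and
# must be adjusted by hand to match the real schema.
SOME_FIELD_NAMES = ("name", "gender", "age")

try:
    # Pig makes this decorator available when the file is registered
    # "using jython"; outside Pig the name is undefined.
    outputSchema
except NameError:
    def outputSchema(schema):
        # No-op stand-in so the module can be imported and tested outside Pig.
        def decorator(func):
            return func
        return decorator


def as_dict(tup, names=SOME_FIELD_NAMES):
    """Pair positional tuple values with the hand-maintained field names."""
    if tup is None:
        return {}
    return dict(zip(names, tup))


@outputSchema("name:chararray")
def get_name(some_field):
    # Look the value up by name instead of by numeric index.
    return as_dict(some_field).get("name")
```

The Pig script would not need to change. The obvious drawback is that the name list is duplicated from the Thrift definition, so it can silently drift if the struct is reordered.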
2013/2/7 Stanley Xu <[EMAIL PROTECTED]>

> Dear All,
>
> We are using Pig with elephant-bird Thrift to process structured records.
> We used to write tons of UDFs in Java, and we are now trying to use
> Jython UDFs more since they are much easier to write and deliver (no need to
> compile and package).
>
> But I am wondering how I can refer to a field of a tuple by name in a Jython
> UDF.
>
> For example, I have a Pig script like the following:
>
> register '/opt/piglib/0.10.0/*.jar';
> register 'jython_udf.py' using jython as jython;
>
> raw_data = load '$INPUT' using
> com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('SomeClass');
>
> A = FOREACH raw_data GENERATE jython.get_name(some_field);
>
> And some_field here is a thrift structure, which may have three fields like
> 'name', 'gender', 'age'.
>
> I tried to write the Jython UDF as follows, but it failed:
>
> @outputSchema("name:chararray")
> def get_name(input):
>   return input.name
>
> It looks like what I get in Jython is a tuple. Is there any way I could access
> a field by name rather than by numeric index, like input.getField(0)?
>
>
>
> Best wishes,
> Stanley Xu
>
Jonathan Coveney 2013-02-08, 07:04