Pig, mail # user - How to refer field name in Jython UDF


Stanley Xu 2013-02-08, 03:18
Re: How to refer field name in Jython UDF
Jonathan Coveney 2013-02-08, 07:03
Currently, the answer to this is no. In Javaland in 0.11.0 you can get the
schema in an EvalFunc, and it would not be hard to make this available from
a Jython UDF, though we'd need a patch.
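
Until such a patch exists, one workaround is to keep the name-to-position mapping inside the UDF script. The following is a minimal sketch, not an official Pig API: the `SOME_FIELD_INDEX` table and the `field` helper are hypothetical names, and the field order (name, gender, age) is an assumption that has to match the tuple Pig actually passes in. It also assumes Pig hands the tuple to Jython as an indexable Python tuple.

```python
# When registered through Pig, the outputSchema decorator is already defined.
# This fallback only exists so the file can also be imported outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap

# ASSUMPTION: positions of the fields inside some_field. Pig does not expose
# field names to the Jython UDF, so this table is maintained by hand.
SOME_FIELD_INDEX = {'name': 0, 'gender': 1, 'age': 2}


def field(tup, name):
    """Hypothetical helper: look up a tuple element by its field name."""
    if tup is None:
        return None
    return tup[SOME_FIELD_INDEX[name]]


@outputSchema("name:chararray")
def get_name(some_field):
    # Access by position, addressed through the hand-written name table.
    return field(some_field, 'name')
```

The Pig script from the question would call this unchanged as jython.get_name(some_field). The obvious drawback is that the table silently goes stale if the thrift struct's field order changes. If only one field is needed, projecting it on the Pig side (some_field.name) and passing the scalar to the UDF may be simpler.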
2013/2/7 Stanley Xu <[EMAIL PROTECTED]>

> Dear All,
>
> We are using Pig with elephant-bird Thrift to process structured records.
> We used to write tons of UDFs in Java, and we are now trying to use
> Jython UDFs more since they are much easier to write and deliver (no need to
> compile and package).
>
> But I am wondering how I could refer to a field of a tuple by name in a
> Jython UDF.
>
> For example, I have a Pig script like the following:
>
> register '/opt/piglib/0.10.0/*.jar';
> register 'jython_udf.py' using jython as jython;
>
> raw_data = load '$INPUT' using
> com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('SomeClass');
>
> A = FOREACH raw_data GENERATE jython.get_name(some_field);
>
> Here some_field is a Thrift structure, which may have three fields like
> 'name', 'gender', and 'age'.
>
> I tried to write the Jython UDF as follows, but it failed:
>
> @outputSchema("name:chararray")
> def get_name(input):
>   return input.name
>
> It looks like what I get in Jython is a tuple. Is there any way I could access
> a field by name rather than by a numeric index, like input.getField(0)?
>
>
>
> Best wishes,
> Stanley Xu
>
Jonathan Coveney 2013-02-08, 07:04