Pig, mail # user - How to refer field name in Jython UDF


Re: How to refer field name in Jython UDF
Jonathan Coveney 2013-02-08, 07:04
Sorry, hit enter prematurely.

That said, in this particular case there is a workaround, though it's a
little janky: you could write a helper that also takes the thrift class
name, i.e. get_name(some_field, 'SomeClass'), and uses SomeClass to let
you refer to the fields by name.
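
A rough sketch of the helper idea above, in case it is useful. Everything here is an assumption on my part: the FIELD_ORDER table, the get_field helper, and the field order 'name', 'gender', 'age' are hypothetical and maintained by hand, and nothing reads the real thrift schema. It also assumes the record reaches the UDF as a positional tuple, as described in the original question.

```python
# Pig is expected to supply the outputSchema decorator when it loads the
# script. Outside Pig it is undefined, so fall back to a no-op version;
# this fallback exists only so the file can be run and tested standalone.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        return lambda func: func

# ASSUMPTION: hand-maintained field order for each thrift class. It must
# match the order in which the loader emits the fields; verify it against
# the output of DESCRIBE before relying on it.
FIELD_ORDER = {
    'SomeClass': ['name', 'gender', 'age'],
}

# Build a name -> position lookup per class. Written without tuple.index()
# or dict comprehensions so it also runs on older Jython releases.
FIELD_INDEX = dict(
    (cls, dict((name, i) for i, name in enumerate(fields)))
    for cls, fields in FIELD_ORDER.items()
)

def get_field(record, class_name, field_name):
    """Return the value of field_name from a positional record."""
    return record[FIELD_INDEX[class_name][field_name]]

@outputSchema("name:chararray")
def get_name(some_field, class_name):
    # class_name is passed in from the Pig script, e.g. 'SomeClass'.
    return get_field(some_field, class_name, 'name')
```

From the Pig side the call would then look like jython.get_name(some_field, 'SomeClass'). The obvious downside is that the table has to be kept in sync with the thrift definition by hand.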
2013/2/8 Jonathan Coveney <[EMAIL PROTECTED]>

> Currently, the answer to this is no. In Javaland in 0.11.0 you can get the
> schema in an EvalFunc, and it would not be hard to make this available from
> a Jython UDF, though we'd need a patch.
>
>
> 2013/2/7 Stanley Xu <[EMAIL PROTECTED]>
>
>> Dear All,
>>
>> We are using Pig with elephant-bird thrift to process structured records.
>> We used to write tons of UDFs in Java, and we are trying to use Jython
>> UDFs more since they are much easier to write and deliver (no need to
>> compile and package).
>>
>> But I am wondering how I could refer to a field of a tuple by name in a
>> Jython UDF.
>>
>> For example, I have a Pig script like the following:
>>
>> register '/opt/piglib/0.10.0/*.jar';
>> register 'jython_udf.py' using jython as jython;
>>
>> raw_data = load '$INPUT' using
>> com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('SomeClass');
>>
>> A = FOREACH raw_data GENERATE jython.get_name(some_field);
>>
>> And some_field here is a thrift structure, which may have three fields
>> like 'name', 'gender', 'age'.
>>
>> I tried to write the Jython UDF as follows, but it failed:
>>
>> @outputSchema("name:chararray")
>> def get_name(input):
>>   return input.name
>>
>> It looks like what I get in Jython is a tuple. Is there any way I could
>> access a field by name rather than by numeric index, like
>> input.getField(0)?
>>
>>
>>
>> Best wishes,
>> Stanley Xu
>>
>
>