Pig >> mail # user >> How to refer field name in Jython UDF


Re: How to refer field name in Jython UDF
Sorry, hit enter prematurely.

In this particular case, though it's a little janky, you could have a
helper that takes the Thrift class name, i.e. get_name(some_field, 'SomeClass'),
and uses SomeClass to let you refer to fields by name.
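
A rough sketch of that helper idea, not a tested Pig job. It assumes the Thrift struct reaches the Jython UDF as a plain positional tuple. The FIELD_ORDER table and the get_field helper are hypothetical names made up for illustration, and the field order ('name', 'gender', 'age') is assumed from the example in the original question.

```python
# Hypothetical lookup table: Thrift class name -> field names in tuple order.
# The order below is an assumption taken from the example in the question;
# it has to match the order in which the loader emits the fields.
FIELD_ORDER = {
    "SomeClass": ("name", "gender", "age"),
}


def get_field(record, class_name, field_name):
    """Return the value of field_name from a positional tuple."""
    # Translate the field name into a position, then index the tuple.
    index = FIELD_ORDER[class_name].index(field_name)
    return record[index]


# When registered with Pig, this function would also carry the
# @outputSchema("name:chararray") decorator, which Pig supplies at
# registration time. It is left off here so the sketch runs as plain Python.
def get_name(some_field, class_name):
    return get_field(some_field, class_name, "name")
```

From the Pig script this would be called as jython.get_name(some_field, 'SomeClass'). The cost is that the table has to be kept in sync with the Thrift definition by hand, which is the janky part.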
2013/2/8 Jonathan Coveney <[EMAIL PROTECTED]>

> Currently, the answer to this is no. In Javaland in 0.11.0 you can get the
> schema in an EvalFunc, and it would not be hard to make this available from
> a Jython UDF, though we'd need a patch.
>
>
> 2013/2/7 Stanley Xu <[EMAIL PROTECTED]>
>
>> Dear All,
>>
>> We are using Pig with elephant-bird Thrift to process structured records.
>> We used to write tons of UDFs in Java, and we are now trying to use
>> Jython UDFs more since they are much easier to write and deliver (no need
>> to compile and package).
>>
>> But I am wondering how I could refer to a field of a tuple by name in a
>> Jython UDF.
>>
>> For example, I have a Pig script like the following:
>>
>> register '/opt/piglib/0.10.0/*.jar';
>> register 'jython_udf.py' using jython as jython;
>>
>> raw_data = load '$INPUT' using
>> com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('SomeClass');
>>
>> A = FOREACH raw_data GENERATE jython.get_name(some_field);
>>
>> And some_field here is a thrift structure, which may have three fields
>> like
>> 'name', 'gender', 'age'.
>>
>> I tried to write the jython udf as the following but failed:
>>
>> @outputSchema("name:chararray")
>> def get_name(input):
>>   return input.name
>>
>> It looks like what I get in Jython is a tuple. Is there any way I could
>> access a field by name rather than by index, like input.getField(0)?
>>
>>
>>
>> Best wishes,
>> Stanley Xu
>>
>
>