Pig, mail # user - How to refer field name in Jython UDF


How to refer field name in Jython UDF
Stanley Xu 2013-02-08, 03:18
Dear All,

We are using Pig with elephant-bird Thrift to process structured records.
We wrote tons of UDFs in Java before, and we are now trying to use Jython
UDFs more, since they are much easier to write and deliver (no need to
compile and package).

But I am wondering how I can refer to a field of a tuple by name in a
Jython UDF.

For example, I have a Pig script like the following:

register '/opt/piglib/0.10.0/*.jar';
register 'jython_udf.py' using jython as jython;

raw_data = load '$INPUT' using
com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('SomeClass');

A = FOREACH raw_data GENERATE jython.get_name(some_field);

Here some_field is a Thrift struct, which may have three fields such as
'name', 'gender', and 'age'.

I tried to write the Jython UDF as follows, but it failed:

@outputSchema("name:chararray")
def get_name(input):
  return input.name

It looks like what I get in Jython is a tuple. Is there any way I could
access a field by name rather than by a numeric index, like input.getField(0)?

Best wishes,
Stanley Xu
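
A minimal sketch of one possible workaround, written as plain Python so it
can be tried outside Pig. It assumes the struct arrives in the UDF as a
positional tuple, as the message above describes, and that its fields come
in the order name, gender, age. The SOME_FIELD_NAMES list and the as_record
helper are hypothetical names introduced here for illustration; they are not
part of Pig or elephant-bird, and the real field order should be checked
against the Thrift definition.

```python
# Assumed field order of the Thrift struct (hypothetical, for illustration).
SOME_FIELD_NAMES = ("name", "gender", "age")


def as_record(tup, names=SOME_FIELD_NAMES):
    """Pair the positional values of a tuple with the assumed field names."""
    if len(tup) != len(names):
        raise ValueError("expected %d fields, got %d" % (len(names), len(tup)))
    return dict(zip(names, tup))


def get_name(input):
    # In the registered UDF file this function would also carry the
    # @outputSchema("name:chararray") decorator from the original post.
    return as_record(input)["name"]


# Made-up sample standing in for the value of some_field.
sample = ("Alice", "F", 30)
print(get_name(sample))
```

The name-to-position mapping lives in one place, so the UDF body reads by
field name even though the underlying access is still positional.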
Jonathan Coveney 2013-02-08, 07:03
Jonathan Coveney 2013-02-08, 07:04