Pig, mail # user - Accessing tuple field names from within a python udf


Re: Accessing tuple field names from within a python udf
Jonathan Coveney 2012-11-15, 20:20
Martin,

That is a reasonable workaround. Even in Java UDFs, you can't directly
access fields by name; tuples are indexed only by position. Using the Schema
is how I would do it.
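
For illustration, the schema-as-parameter workaround quoted below might be sketched roughly as follows. This is only a sketch under stated assumptions, not a tested Pig script: it assumes the schema reaches the UDF as a flat, comma-separated string of "name: type" pairs (the exact text Pig produces may differ, and nested types are not handled), and the helper names field_index, get_field, and name_and_city are hypothetical. The Pig-side registration and output schema declaration are left out.

```python
# Sketch only: look up tuple fields by name inside a Python UDF, given a
# schema string passed in as an extra argument.
# Assumption: the schema is a flat list such as
#   "name: chararray, age: int, city: chararray"
# Nested bags, tuples, and maps are NOT handled here.

def field_index(schema):
    """Build a {field name: position} map from a flat schema string."""
    names = [part.split(":")[0].strip() for part in schema.split(",")]
    return dict((name, i) for i, name in enumerate(names))


def get_field(fields, schema, name):
    """Return the value of the named field from a tuple-like sequence."""
    return fields[field_index(schema)[name]]


# Hypothetical UDF body: the first argument is the schema string, the
# second is the record as the UDF sees it (a plain sequence of values).
def name_and_city(schema, fields):
    return "%s (%s)" % (get_field(fields, schema, "name"),
                        get_field(fields, schema, "city"))
```

With this shape, the UDF refers to "name" and "city" rather than to column numbers, so reordering or adding columns only changes the schema string that is passed in. For very wide tuples it may be worth building the index map once and reusing it instead of re-parsing the schema on every call.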
2012/11/14 Martin Goodson <[EMAIL PROTECTED]>

> Sorry to reply to my own post, but I've found a workaround that I
> thought I should put here:
>
> 1. Use embedded Pig.
> 2. Access the schema with boundscript.describe().
> 3. Pass the schema as a parameter into the UDF call.
>
> Thanks
> Martin
>
>
>
>
> On 14 November 2012 16:17, Martin Goodson <[EMAIL PROTECTED]>
> wrote:
>
> > I normally deal with very large tuples with many fields. It's a pain to
> > deal with these in Python UDFs, since I can't figure out a way to pass
> > schemas into the UDF. I have to hard-code the column numbers in the UDFs,
> > which is a maintenance nightmare.
> >
> > It seems that Java UDFs receive the full tuple in their exec methods so
> > that the correct fields can be identified, whereas Python UDFs only
> > receive list objects (with field names stripped). Is there any way to get
> > the behaviour of Python UDFs to conform to the Java behaviour?
> >
> >
> > Thanks for any ideas
> > Martin
> >
> >
>