Pig >> mail # user >> Accessing tuple field names from within a python udf


Martin Goodson 2012-11-14, 16:17
Re: Accessing tuple field names from within a python udf
Sorry to reply to my own post, but I've found a workaround that I
thought I should put here:

1. Use embedded Pig.
2. Access the schema with boundscript.describe().
3. Pass the schema as a parameter into the UDF call.
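
The UDF side of this workaround could look roughly like the sketch below. It assumes the schema arrives as a flat string in the style that DESCRIBE prints (for example 'data: {name: chararray,age: int}'); nested bags, tuples, and maps are not handled. The names field_index and get_field are hypothetical helpers made up for illustration, not part of Pig.

```python
def field_index(schema):
    """Map field names to positions for a flat schema string.

    Assumes DESCRIBE-style text such as 'data: {name: chararray,age: int}'.
    Nested types (bags, tuples, maps) are NOT supported by this sketch.
    """
    # Keep only the text between the outermost braces.
    inner = schema[schema.index("{") + 1:schema.rindex("}")]
    # Each comma-separated part looks like 'name: type'; keep the name.
    names = [part.split(":")[0].strip() for part in inner.split(",")]
    return dict((name, i) for i, name in enumerate(names))


def get_field(tup, schema, name):
    """Look up a field by name instead of a hard-coded column number."""
    return tup[field_index(schema)[name]]


# In embedded Pig the schema string would come from describe() on the
# bound script; it is hard-coded here so the sketch runs on its own.
schema = "data: {name: chararray,age: int,gpa: float}"
row = ["alice", 30, 3.5]
age = get_field(row, schema, "age")
```

With this in place, the UDF body refers to fields by name, and only the schema string changes when columns are added or reordered.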

Thanks
Martin
On 14 November 2012 16:17, Martin Goodson <[EMAIL PROTECTED]> wrote:

> I normally deal with very large tuples with many fields. It's a pain to
> deal with these in Python UDFs since I can't figure out a way to pass
> schemas into the UDF. I have to hard-code the column numbers in the UDFs,
> which is a maintenance nightmare.
>
> It seems that Java UDFs receive the full tuple in their exec methods so
> that the correct fields can be identified, whereas Python UDFs only receive
> list objects (with field names stripped). Is there any way to get the
> behaviour of Python UDFs to conform to the Java behaviour?
>
>
> Thanks for any ideas
> Martin
>
>