Pig, mail # dev - Attach bag for each tuple and pass to UDF


Re: Attach bag for each tuple and pass to UDF
Daniel Dai 2013-10-23, 21:09
Can you do a cross?
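
One way to read the CROSS suggestion, as a rough sketch rather than a tested script: collapse the dictionary into a single tuple with GROUP ... ALL, then CROSS it with rows so every row carries the whole bag into the UDF. The Pig Latin in the comments is untested, and the file name udf.py and the field names id, k and v are assumptions that do not appear in the original mail. The Python below only mimics those two steps on plain lists so the UDF logic can be tried locally.

```python
# Hypothetical Pig Latin driver (untested sketch; udf.py, id, k, v are assumed names):
#
#   REGISTER 'udf.py' USING jython AS udf;
#   dict_all = GROUP tinyDictionary ALL;     -- one tuple: (all, {whole dictionary bag})
#   joined   = CROSS rows, dict_all;         -- every row gets that single tuple appended
#   result   = FOREACH joined GENERATE
#                  udf.my_python_udf(rows::id, dict_all::tinyDictionary);

try:
    outputSchema  # made available by Pig when the script is registered with jython
except NameError:
    # Fallback so the file also runs outside Pig (local testing only).
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema("label:chararray")
def my_python_udf(key, dictionary_bag):
    """Look up one row key in the dictionary bag (a list of (k, v) tuples)."""
    lookup = dict((k, v) for k, v in dictionary_bag)
    return lookup.get(key)


def cross_with_dictionary(rows, tiny_dictionary):
    """Local stand-in for GROUP ... ALL followed by CROSS."""
    dict_all = [("all", list(tiny_dictionary))]               # GROUP tinyDictionary ALL
    return [row + grouped for row in rows for grouped in dict_all]  # CROSS rows, dict_all


if __name__ == "__main__":
    rows = [("a", 1), ("b", 2), ("z", 3)]
    tiny_dictionary = [("a", "alpha"), ("b", "beta")]
    for record in cross_with_dictionary(rows, tiny_dictionary):
        # record[0] is the row key, record[-1] is the whole dictionary bag
        print(record[0], my_python_udf(record[0], record[-1]))
```

Because the grouped relation holds exactly one tuple, the CROSS does not multiply the number of rows. The sketch rebuilds the lookup dict on every call, which is only reasonable while the dictionary stays tiny.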
On Mon, Oct 21, 2013 at 2:21 PM, Serega Sheypak <[EMAIL PROTECTED]> wrote:

> Hi, I have two relations:
> relation *rows* (>10GB)
> relation *tinyDictionary* (<1MB)
>
> I want to take each tuple from *rows* and attach *tinyDictionary* to it.
> And then pass it to python UDF:
>
> result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_Rows, wholeTinyDictionary);
>
> How can I do that?
>
> There is a solution using DistributedCache, but I would like to
> avoid using Java.
> Also, *TinyDictionary* could be spread across several files, which would
> make it hard to deal with.
>
