Attach bag for each tuple and pass to UDF
Serega Sheypak 2013-10-21, 21:21
Hi, I have two relations:
relation *rows* (>10GB)
relation *tinyDictionary* (<1MB)
I want to take each tuple from *rows*, attach *tinyDictionary* to it,
and then pass it to a Python UDF:
result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_*
How can I do that?
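To make the goal concrete, here is a minimal sketch of the kind of UDF I mean, written as plain Python so it runs outside Pig. The two-field (key, value) layout of tinyDictionary, the lookup on the first field of the row, and the argument names are assumptions for illustration only; the schema decorator Pig would need is left out.

```python
def my_python_udf(row, dictionary_bag):
    # Assumption: the bag arrives as a list of tuples, here hypothetical
    # (key, value) pairs taken from tinyDictionary.
    lookup = dict(dictionary_bag)
    # Assumption: the first field of the row is the key to look up.
    key = row[0]
    # Returns None when the key is not in the dictionary.
    return lookup.get(key)


if __name__ == "__main__":
    # Hypothetical sample data standing in for one row and the small relation.
    sample_row = ("k1", 42)
    sample_dictionary = [("k1", "first"), ("k2", "second")]
    print(my_python_udf(sample_row, sample_dictionary))
```

The point is only the call shape: one row plus the whole small relation as a second argument.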
There is a solution that uses DistributedCache, but I would like to
avoid writing Java.
Also, *tinyDictionary* could be in several files. It would be hard to deal