Hi, I have two relations:
relation *rows* (>10GB)
relation *tinyDictionary* (<1MB)
I want to take each tuple from *rows* and attach *tinyDictionary* to it,
and then pass it to a Python UDF:
result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_*
How can I do that?
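To make the goal concrete, here is a rough sketch of what the Python side might look like. Only the name my_python_udf comes from the call above. The two-argument layout, the parameter names, and the lookup logic are assumptions for illustration. It is written as plain Python, so it leaves out anything Pig-specific such as an output schema declaration.

```python
# Hypothetical sketch: the argument layout is an assumption for illustration,
# not something Pig is known to pass in exactly this shape.
def my_python_udf(row, dictionary_entries):
    """Replace each field of `row` with its dictionary value, if it has one.

    row                -- one tuple from rows, e.g. ('a', 'x')
    dictionary_entries -- all of tinyDictionary as (key, value) pairs
    """
    # tinyDictionary is small (<1MB), so building an in-memory dict is cheap.
    lookup = dict(dictionary_entries)
    # Fields without a dictionary entry are passed through unchanged.
    return tuple(lookup.get(field, field) for field in row)
```

So every call would need the whole of *tinyDictionary* next to a single tuple from *rows*.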
There is a solution that uses DistributedCache, but I would like to
avoid writing any Java.
Also, *tinyDictionary* could be spread across several files. It would be hard to deal
Daniel Dai 2013-10-23, 21:09
Pradeep Gollakota 2013-10-23, 22:32