What would be the best way to write this script?
I have two datasets - huge (hkey, hdata), small(skey). I want to filter
all the data from huge dataset for which F(hdata, skey) is true.
huge = load 'mydata' as (key:chararray, value:chararray);
small = load 'smalldata' as skey:chararray;
h_s_cross = cross huge, small;
filtered = foreach h_s_cross generate CONTAINS(value, skey);