Filter on contents of other dataset
Aniket Mokashi 2011-04-15, 03:21
What would be the best way to write this script?
I have two datasets: huge (hkey, hdata) and small (skey). I want to keep
only the records from the huge dataset for which F(hdata, skey) is true.
huge = load 'mydata' as (key:chararray, value:chararray);
small = load 'smalldata' as (skey:chararray);
h_s_cross = cross huge, small;
-- CONTAINS stands in for the predicate F described above
filtered = filter h_s_cross by CONTAINS(value, skey);