Mat Kelcey 2012-08-29, 23:55
Mat Kelcey 2012-08-30, 00:08
Jonathan Coveney 2012-08-30, 00:06
Mat Kelcey 2012-08-30, 00:14
Mat Kelcey 2012-08-30, 00:29
Mat Kelcey 2012-08-30, 00:48
Join on a dummy key or use CROSS, then pass the token and the text to a UDF that checks for containment.
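
A minimal sketch of the CROSS route, using the relation and field names from the question below. It swaps in Pig's builtin INDEXOF for the custom UDF, which assumes plain substring matching is good enough (so 'frog' would also match 'frogs'). The aliases `pairs` and `matches` are made up for illustration.

```pig
-- Load the two relations exactly as in the question.
querys    = load 'query'    as (id:int, token:chararray);
documents = load 'document' as (id:int, text:chararray);

-- Pair every query with every document.
-- Note the cost: this produces |querys| x |documents| rows.
pairs = cross querys, documents;

-- Keep only the pairs whose text contains the token.
-- INDEXOF returns -1 when the search string is not found.
-- Assumption: substring containment, not whole-word matching.
matches = filter pairs by INDEXOF(documents::text, querys::token, 0) >= 0;

dump matches;
```

If whole-word matching is needed, the INDEXOF test could be replaced by a UDF that tokenizes the text first. CROSS gets expensive quickly, so this is probably only workable when at least one of the relations is small.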
On Aug 29, 2012, at 4:56 PM, Mat Kelcey <[EMAIL PROTECTED]> wrote:
> Considering the following two relations...
> grunt> querys = load 'query' as (id:int, token:chararray);
> grunt> dump querys
> grunt> documents = load 'document' as (id:int, text:chararray);
> grunt> dump documents;
> (21,foo bar frog)
> (22,hello frog)
> Is it possible to do a join where the query:token is not equal to but
> contained in documents:text ?
> (11,foo,21,foo bar frog)
> (12,bar,21,foo bar frog)
> (13,frog,21,foo bar frog)
> (13,frog,22,hello frog)
> I can certainly do this in Java map/reduce (as we all had to in the
> dark days before pig) but is there a way to hack this together
> with a custom udf or some other weird join backdoor (custom
> partitioner for a group or something whacky)?
> It's been a long day, maybe I'm just missing something super obvious.