Re: NOT IN and EXCEPT
Thejas M Nair 2010-08-22, 05:16
On 8/21/10 10:45 AM, "Defenestrator" <[EMAIL PROTECTED]> wrote:
> I come from the DBMS world and am not really familiar with PIG, so hopefully
> I'm asking reasonable questions.
>
> I was basically wondering if there are patterns in PIG to do the following
> set operations:
>
> 1. select * from foo where foo.a NOT IN (select x from bar);
> 2. select a, b from foo EXCEPT select x, y from bar;

1. This can be implemented as a left outer join followed by a filter on null. In SQL it is equivalent to:

    select * from foo left outer join bar on (foo.a = bar.x) where bar.x is null;

In Pig Latin you can do:

    J = join foo by a LEFT, bar by x;
    F = filter J by x is null;

Or use cogroup:

    CG = cogroup foo by a, bar by x;
    F = filter CG by SIZE(bar) == 0;

2. The difference between 'not in' and 'except' is that you do a distinct on the columns of foo:

    foo_ab = foreach foo generate a, b;
    distinct_foo = distinct foo_ab;
    CG = cogroup distinct_foo by (a, b), bar by (x, y);
    F = filter CG by SIZE(bar) == 0;
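
The cogroup results still hold the grouped form (group key, a bag of foo rows, an empty bag for bar), so getting plain rows back takes one more step. A minimal sketch of both patterns end to end follows. The file names, the int schemas, and the aliases CG1, F1, CG2, F2, not_in and except_rows are assumptions made for illustration, and IsEmpty is used here in place of SIZE(bar) == 0. It also assumes the keys are non-null, since null keys in Pig are not handled the same way as in SQL's NOT IN.

```pig
-- Hypothetical inputs: file names and schemas are assumptions for illustration.
foo = LOAD 'foo.tsv' AS (a:int, b:int);
bar = LOAD 'bar.tsv' AS (x:int, y:int);

-- 1. NOT IN: keep the foo rows whose key a has no matching bar.x.
CG1 = COGROUP foo BY a, bar BY x;
F1 = FILTER CG1 BY IsEmpty(bar);
-- Unnest the bag of foo rows so the output has foo's original columns.
not_in = FOREACH F1 GENERATE FLATTEN(foo);

-- 2. EXCEPT: same idea, but on distinct (a, b) pairs.
foo_ab = FOREACH foo GENERATE a, b;
distinct_foo = DISTINCT foo_ab;
CG2 = COGROUP distinct_foo BY (a, b), bar BY (x, y);
F2 = FILTER CG2 BY IsEmpty(bar);
-- The group key is the (a, b) tuple itself, so flattening it yields the rows.
except_rows = FOREACH F2 GENERATE FLATTEN(group) AS (a, b);

DUMP not_in;
DUMP except_rows;
```

With the join version of pattern 1 the extra step would instead be a foreach that projects only foo's columns, since the joined rows also carry bar's (null) columns.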