Pig user mailing list: pig script similar to select from not in in SQL

Chan, Tim 2012-01-24, 21:47
Jonathan Coveney 2012-01-24, 21:53
RE: pig script similar to select from not in in SQL
Thanks Jonathan.

I did the following:

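a = load 'data1';
b = load 'data2';
-- left outer join keeps every row of a, matched or not
c = join a by $0 left outer, b by $0;
-- unmatched rows have nulls for b's fields
-- ($1 is b's first field only if a has a single column)
d = filter c by $1 is null;
-- keep just the key from a
e = foreach d generate $0;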
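
For anyone reusing this, here is a sketch of the same left outer join with named fields instead of positions. The schema (a single chararray column called id) is an assumption made for illustration, not something stated in this thread. As far as I know, Pig also wants a schema on the relation that gets null-filled in an outer join, which is another reason to declare one.

```pig
-- Assumed schema: one chararray column named id in each input (hypothetical).
a = load 'data1' as (id:chararray);
b = load 'data2' as (id:chararray);
-- Left outer join keeps every row of a; b's fields are null where nothing matches.
c = join a by id left outer, b by id;
-- Rows of a that found no partner in b.
d = filter c by b::id is null;
-- Project back to a's key only.
e = foreach d generate a::id as id;
```

Referring to b::id rather than $1 means the filter still targets b's key if data1 later gains more columns.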
________________________________________
From: Jonathan Coveney [[EMAIL PROTECTED]]
Sent: Tuesday, January 24, 2012 1:53 PM
To: [EMAIL PROTECTED]
Subject: Re: pig script similar to select from not in in SQL

I would do the following (obviously this is a bit shorthand):

a = load 'data1';
b = load 'data2';
c = cogroup a by $0, b by $0;
d = filter c by IsEmpty(b);

d would be a relation containing only the keys, and their corresponding rows,
that exist in a but not in b
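
To get from d back to plain rows of a, one option is to flatten the a bag. A minimal sketch, assuming a hypothetical one-column schema named id (the thread does not say what the data looks like):

```pig
-- Assumed schema: one chararray column named id in each input (hypothetical).
a = load 'data1' as (id:chararray);
b = load 'data2' as (id:chararray);
-- One output row per key, holding a bag of matching rows from each input.
c = cogroup a by id, b by id;
-- Keep only the keys that have no rows in b.
d = filter c by IsEmpty(b);
-- Unnest the a bag so e holds the original rows of a that are absent from b.
e = foreach d generate flatten(a);
```

Keys that appear only in b are dropped by the filter, since their b bag is not empty.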

2012/1/24 Chan, Tim <[EMAIL PROTECTED]>

> I would like to generate a set of data that represents the items not found
> in another set.
> How would I do this using Pig?
>
> I'm thinking I would do an outer join and then filter off the items that
> were matched.
>
>