RE: pig script similar to select from not in in SQL
Thanks Jonathan.

I did the following:

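a = load 'data1' as (id);              -- single-column schema assumed for illustration
b = load 'data2' as (id);              -- an outer join needs a schema on the side that produces nulls
c = join a by id left outer, b by id;
d = filter c by b::id is null;         -- keep rows of a that have no match in b
e = foreach d generate a::id;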
________________________________________
From: Jonathan Coveney [[EMAIL PROTECTED]]
Sent: Tuesday, January 24, 2012 1:53 PM
To: [EMAIL PROTECTED]
Subject: Re: pig script similar to select from not in in SQL

I would do the following (obviously this is a bit shorthand):

a = load 'data1';
b = load 'data2';
c = cogroup a by $0, b by $0;
d = filter c by IsEmpty(b);

d would be a relation containing only the keys that exist in a but not
in b, along with their corresponding rows from a
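
If the original rows are wanted instead of the grouped form, the shorthand above can be extended with a flatten step. This is only a sketch: the single-column schema (id) and the final dump are assumptions added for illustration, not part of the original suggestion.

```pig
-- Sketch only: the one-field schema (id) is assumed; adjust to the real data.
a = load 'data1' as (id);
b = load 'data2' as (id);
c = cogroup a by id, b by id;        -- c: (group, a: {bag of rows}, b: {bag of rows})
d = filter c by IsEmpty(b);          -- keep keys that have no rows in b
e = foreach d generate flatten(a);   -- unnest the bag to get the rows of a back
dump e;
```

Because the filter drops every group where b is non-empty, e holds the rows of a whose key never appears in b, which is the "not in" result asked for.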

2012/1/24 Chan, Tim <[EMAIL PROTECTED]>

> I would like to generate a set of data that represents the items not found
> in another set.
> How would I do this using Pig?
>
> I'm thinking I would do an outer join and then filter off the items that
> were matched.
>
>