Re: Does pig support in clause?
Agreed. With some optimization we could make a semi-join more efficient than this, since it only needs to keep one record per key per map instead of all the records for a key.

Alan.
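
For anyone trying this today, a semi-join can also be approximated at the script level by reducing C to its distinct keys before an ordinary inner join, so that each row of A matches at most once. This is only a sketch: the file names, the `payload` field, and the assumption that C carries its key in a field named A1 are all hypothetical, and it is not the optimizer-level change discussed above.

```pig
-- Hypothetical inputs; adjust paths and schemas to your data.
A = LOAD 'a.tsv' AS (A1:chararray, payload:chararray);
C = LOAD 'c.tsv' AS (A1:chararray);

-- Keep a single record per key on the C side.
C_key_col = FOREACH C GENERATE A1;
C_keys = DISTINCT C_key_col;

-- Because C_keys has no duplicate keys, the inner join cannot
-- multiply rows of A; it only drops rows whose key is absent from C.
J = JOIN A BY A1, C_keys BY A1;

-- Project back to A's original columns.
B = FOREACH J GENERATE A::A1 AS A1, A::payload AS payload;
```

If the distinct key set is small enough to fit in memory, adding USING 'replicated' to the JOIN (with C_keys listed last) may let Pig do the join map-side, though whether that helps depends on the data.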

On Jun 25, 2012, at 10:17 AM, Russell Jurney wrote:

> This could be a cool rewrite feature like CUBE/SAMPLE.
>
> Russell Jurney http://datasyndrome.com
>
> On Jun 25, 2012, at 9:39 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> This type of IN is really a semi-join, so you could rewrite it as:
>>
>> B1 = cogroup A by A1, C by A1;
>> B2 = filter B1 by SIZE(C) > 0;
>> B = foreach B2 generate flatten(A);
>>
>> Alan.
>>
>> On Jun 25, 2012, at 2:50 AM, yonghu wrote:
>>
>>> Dear all,
>>>
>>> In SQL, there is an IN clause, which is used to check whether a value
>>> is in a set. Does Pig have an equivalent IN clause? For example:
>>>
>>> B = filter A by A1 in C;
>>>
>>> A, B, and C are relation names, and A1 is a column name of A.
>>>
>>> Thanks!
>>>
>>> Yong
>>