Pig, mail # user - Lookup in a dataset


Re: Lookup in a dataset
Swaroop Kumar Patra 2013-11-14, 14:13
Thanks, Aaron, for the reply.

I will try this out.

Thanks,
Swaroop

On 14-Nov-2013, at 5:37 pm, Aaron Zimmerman <[EMAIL PROTECTED]> wrote:

> You’ll want to use COGROUP.
>
> Something like
>
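> x = COGROUP input1 BY col3, input2 BY col4;
>
> needed = FILTER x BY IsEmpty(input2);
>
> -- unnest the input1 bag to recover the original rows
> result = FOREACH needed GENERATE FLATTEN(input1);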
>
>
> Thanks,
>
> Aaron Zimmerman
> Platform Engineer
> Sprout Social
> 773.227.7528
> @apzimmerman
> sproutsocial.com
>
> On November 14, 2013 at 1:19:46 AM, Swaroop Patra ([EMAIL PROTECTED]) wrote:
>
>> Hi All,
>>
>> I need a little help scripting the condition below.
>>
>> I have two tab-separated input files. Let's call them input1 and input2.
>> input1
>> ---------
>> col1 col2 col3
>> input2
>> --------
>> col4
>>
>> I have to fetch the records from input1 whose col3 value is not present in
>> input2.col4.
>>
>> e.g.
>> input1
>> ----------
>> 11 12 13
>> 21 22 23
>> 31 32 33
>> 41 42 43
>> input2
>> ---------
>> 12
>> 23
>> 33
>> 45
>>
>>
>> output
>> ---------
>> 11 12 13
>> 41 42 43
>>
>> As 13 (input1.row1.col3) and 43 are not present in input2.col4.
>>
>> Thanks & Regards,
>> Swaroop
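
The COGROUP approach suggested in the reply can be sketched end to end roughly as follows. This is only a sketch under assumptions: the file paths 'input1' and 'input2', the int column types, and the alias `result` are not given in the thread and are placeholders here.

```pig
-- Assumed paths and types; the thread only names the columns.
input1 = LOAD 'input1' USING PigStorage('\t') AS (col1:int, col2:int, col3:int);
input2 = LOAD 'input2' USING PigStorage('\t') AS (col4:int);

-- Group both relations on the lookup key. Each output tuple has the shape
-- (group, {matching input1 tuples}, {matching input2 tuples}).
x = COGROUP input1 BY col3, input2 BY col4;

-- Keep only the keys for which input2 contributed no tuples.
needed = FILTER x BY IsEmpty(input2);

-- Unnest the input1 bag so the output is plain (col1, col2, col3) rows again.
result = FOREACH needed GENERATE FLATTEN(input1);

DUMP result;
```

With the sample data from the question, this should leave the rows 11 12 13 and 41 42 43, since 13 and 43 are the only col3 values missing from input2.col4. The order of the rows in the output is not guaranteed.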