Pig >> mail # user >> how can I distinct one field of a relation


Re: how can I distinct one field of a relation
Well, not exactly.
I want a subset of A that keeps all fields but in which the values of field a1 are distinct.
For example:
A is:
1,2,3
1,2,3
4,5,6

What I want is:
1,2,3
4,5,6

How can I do this with the keyword distinct?

Haitao Yao
[EMAIL PROTECTED]
weibo: @haitao_yao
Skype:  haitao.yao.final
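
DISTINCT in Pig compares whole tuples, so on its own it cannot keep all fields while deduplicating on a1 only. A minimal sketch of the grouping approach hinted at in the quoted reply below ("Sounds like you want a group") follows. It assumes the input is a comma-separated file named 'data'; the PigStorage(',') delimiter and the aliases G, B, and one are illustrative assumptions, not taken from the thread.

```
-- Assumption: 'data' holds comma-separated rows such as 1,2,3
A = LOAD 'data' USING PigStorage(',') AS (a1:int, a2:int, a3:int);

-- Collect all tuples that share the same a1 value into one bag
G = GROUP A BY a1;

-- Keep a single tuple from each bag and flatten it back to (a1, a2, a3)
B = FOREACH G {
    one = LIMIT A 1;
    GENERATE FLATTEN(one);
};

DUMP B;
```

If tuples that share an a1 value differ in their other fields, LIMIT keeps an arbitrary one of them; an ORDER inside the nested block can be added when a specific tuple is wanted.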

On 2012-6-27, at 2:06 PM, Jonathan Coveney wrote:

> If you JUST want a1, then you would do
> A = LOAD 'data' AS (a1:int,a2:int,a3:int);
> B = DISTINCT (foreach A generate a1);
>
> Basically, you project the column you want and apply DISTINCT to it.
>
> 2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
>
>> I want a subset of A in which the a1 values are distinct.
>> The current DISTINCT compares all the fields in A, which is not what I
>> want.
>>
>>
>>
>> Haitao Yao
>> [EMAIL PROTECTED]
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
>> On 2012-6-27, at 11:18 AM, Jonathan Coveney wrote:
>>
>>> What is your desired output? Sounds like you want a group.
>>>
>>> 2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
>>>
>>>> Hi,
>>>>      How can I apply DISTINCT to only one field of a relation?
>>>>      Here's the demo:
>>>>
>>>>      A = LOAD 'data' AS (a1:int,a2:int,a3:int);
>>>>      B = distinct A by a1;
>>>>
>>>>
>>>>      How can I do this?
>>>>
>>>>
>>>>
>>>> Haitao Yao
>>>> [EMAIL PROTECTED]
>>>> weibo: @haitao_yao
>>>> Skype:  haitao.yao.final