Pig, mail # user - how can I distinct one field of a relation


Re: how can I distinct one field of a relation
Jonathan Coveney 2012-06-27, 06:06
If you JUST want a1, then you would do
A = LOAD 'data' AS (a1:int,a2:int,a3:int);
B = DISTINCT (foreach A generate a1);

Basically, you project the column you want and then apply DISTINCT to that projection.
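
If what you need instead is whole rows of A, one per distinct a1 (which is how I read the request quoted below), one option is to group on a1 and keep a single row from each group. This is only a sketch: it assumes you do not care which row survives for a given a1, the aliases G and one are just illustrative names, and it relies on your Pig version supporting LIMIT inside a nested FOREACH.

```pig
-- Sketch: keep one full row of A for each distinct a1 value.
A = LOAD 'data' AS (a1:int, a2:int, a3:int);

-- Each group holds a bag named A with all rows sharing the same a1.
G = GROUP A BY a1;

B = FOREACH G {
    one = LIMIT A 1;        -- pick an arbitrary row from the group's bag
    GENERATE FLATTEN(one);  -- unnest it back to (a1, a2, a3)
};
```

If you need a specific row per a1 (say, the one with the smallest a2), adding an ORDER inside the nested block before the LIMIT should do it.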

2012/6/26 Haitao Yao <[EMAIL PROTECTED]>

> I want a subset of A in which the a1 values are distinct.
> The current DISTINCT compares all the fields in A, which is not what I
> want.
>
>
>
> Haitao Yao
> [EMAIL PROTECTED]
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
> On 2012-6-27, at 11:18 AM, Jonathan Coveney wrote:
>
> > What is your desired output? Sounds like you want a group.
> >
> > 2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
> >
> >> hi,
> >>       How can I distinct only one field of a relation?
> >>       here's the demo:
> >>
> >>       A = LOAD 'data' AS (a1:int,a2:int,a3:int);
> >>       B = distinct A by a1;
> >>
> >>
> >>       how can I do this?
> >>
> >>
> >>
> >> Haitao Yao
> >> [EMAIL PROTECTED]
> >> weibo: @haitao_yao
> >> Skype:  haitao.yao.final
> >>
> >>
>
>