Re: how can I distinct one field of a relation
If you JUST want a1, then you would do:

A = LOAD 'data' AS (a1:int,a2:int,a3:int);
B = DISTINCT (FOREACH A GENERATE a1);

Basically, you project the column you want and then apply DISTINCT to that projection, so only a1 is compared.
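
If instead the goal is one full row of A per distinct a1 value (which is how the quoted question could be read), a common idiom is to group on a1 and keep a single tuple from each group. The following is only a sketch under that assumption: the aliases G, one, and C are illustrative names, and which row survives for a given a1 is arbitrary unless you add an ORDER inside the nested block.

```pig
-- Assumes the same input and schema as the example above.
A = LOAD 'data' AS (a1:int, a2:int, a3:int);

-- Collect all rows that share the same a1 value into one bag per key.
G = GROUP A BY a1;

-- Keep one (arbitrary) tuple from each bag and flatten it back into a row.
C = FOREACH G {
    one = LIMIT A 1;
    GENERATE FLATTEN(one);
};

DUMP C;
```

Each row of C has the original (a1, a2, a3) fields, with each a1 value appearing once.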

2012/6/26 Haitao Yao <[EMAIL PROTECTED]>

> I want a subset of A with a1 value distinct.
> the current distinct will compare all the fields in A, which is not what I
> want.
>
>
>
> Haitao Yao
> [EMAIL PROTECTED]
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
> On 2012-6-27, at 11:18 AM, Jonathan Coveney wrote:
>
> > What is your desired output? Sounds like you want a group.
> >
> > 2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
> >
> >> hi,
> >>       How can I distinct only one field of a relation?
> >>       here's the demo:
> >>
> >>       A = LOAD 'data' AS (a1:int,a2:int,a3:int);
> >>       B = distinct A by a1;
> >>
> >>
> >>       how can I do this?
> >>
> >>
> >>
> >> Haitao Yao
> >> [EMAIL PROTECTED]
> >> weibo: @haitao_yao
> >> Skype:  haitao.yao.final
> >>
> >>
>
>