


how can I distinct one field of a relation
Hi,
How can I apply DISTINCT to only one field of a relation? Here's the demo:

A = LOAD 'data' AS (a1:int,a2:int,a3:int);
B = distinct A by a1;

How can I do this?
Haitao Yao [EMAIL PROTECTED] weibo: @haitao_yao Skype: haitao.yao.final

Re: how can I distinct one field of a relation
What is your desired output? Sounds like you want a group.
2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
> How can I distinct only one field of a relation?
> here's the demo:
>
> A = LOAD 'data' AS (a1:int,a2:int,a3:int);
> B = distinct A by a1;
>
> how can I do this?

Re: how can I distinct one field of a relation
I want a subset of A in which the a1 values are distinct. The current DISTINCT compares all the fields in A, which is not what I want.
On 2012-6-27, at 11:18 AM, Jonathan Coveney wrote:
> What is your desired output? Sounds like you want a group.

Re: how can I distinct one field of a relation
If you JUST want a1, then you would do

A = LOAD 'data' AS (a1:int,a2:int,a3:int);
B = DISTINCT (foreach A generate a1);

Basically, you project the column you want and then apply DISTINCT to it.
2012/6/26 Haitao Yao <[EMAIL PROTECTED]>
> I want a subset of A with a1 value distinct.
> the current distinct will compare all the fields in A, which is not what I
> want.

Re: how can I distinct one field of a relation
Well, not exactly. I want a subset of A that keeps all fields, with field a1 distinct. For example, A is:
1,2,3
1,2,3
4,5,6

What I want is:
1,2,3
4,5,6

How can I do this with the DISTINCT keyword?
On 2012-6-27, at 2:06 PM, Jonathan Coveney wrote:
> If you JUST want a1, then you would do
> A = LOAD 'data' AS (a1:int,a2:int,a3:int);
> B = DISTINCT (foreach A generate a1);
>
> basically you project the column you want, and distinct on it.

Re: how can I distinct one field of a relation
If excluding the rows whose other fields differ is not a problem, then maybe you can use a FILTER to drop them. Also, as @Jonathan said, you can project the fields you want and then apply DISTINCT. He only gave an example that generates a1; you can list more fields in the FOREACH ... GENERATE clause.
On Wed, Jun 27, 2012 at 3:17 PM, Haitao Yao <[EMAIL PROTECTED]> wrote:
> will, not exactly.
> I want a subset of A with all fields, and field a1 is distinct.
> for example:
> A is:
> 1,2,3
> 1,2,3
> 4,5,6
>
> What I want is :
> 1,2,3
> 4,5,6
>
> How can I do this with the keyword distinct?
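
To make the projection suggestion concrete, here is a minimal sketch that keeps two fields before deduplicating. The choice of a1 and a2 and the alias P are only illustrative, not something from the thread. Note that DISTINCT then compares the (a1, a2) pair, so two rows with the same a1 but a different a2 would both survive.

```pig
A = LOAD 'data' AS (a1:int, a2:int, a3:int);
-- Project only the fields that should take part in the comparison
-- (a1 and a2 are an illustrative choice).
P = FOREACH A GENERATE a1, a2;
-- DISTINCT compares whole tuples, which are now (a1, a2) pairs.
B = DISTINCT P;
DUMP B;
```

This only helps when the dropped fields are not needed in the output; it does not keep a3.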

Re: how can I distinct one field of a relation
Hey Haitao,
I didn't quite understand your requirement, and your example seems to be incomplete. Here it is:

A is:
1,2,3
1,2,3
4,5,6

What I want is:
1,2,3
4,5,6

What you did here is a DISTINCT over all fields. But what if the input is
1,2,3
1,3,4
4,5,6
and you are trying to DISTINCT by the first field? What output do you want in that case?

Ruslan
On Wed, Jun 27, 2012 at 3:25 PM, Subir S <[EMAIL PROTECTED]> wrote:
> If those values/fields that differ are not a problem to exclude, then may
> be you can use a FILTER to exclude....
> Also as @Jonathan said, you may project fields you want and then distinct.
> He just gave a example of generating a1, you may take more fields in
> foreach..generate clause
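
One way to get a single full row per a1 value is the GROUP approach hinted at earlier in the thread, sketched below. It is not an answer given by anyone in the thread. It assumes a Pig version that allows ORDER and LIMIT inside a nested FOREACH block, and it assumes that "the row with the smallest a2, then a3" is an acceptable rule for which row to keep, since the thread never says which row should win. The aliases G, ordered and kept are hypothetical names.

```pig
A = LOAD 'data' AS (a1:int, a2:int, a3:int);
-- Collect all rows that share the same a1 into one bag per key.
G = GROUP A BY a1;
B = FOREACH G {
    -- Assumption: the row with the smallest (a2, a3) wins.
    ordered = ORDER A BY a2, a3;
    kept = LIMIT ordered 1;
    -- Flatten the one-row bag back into a plain (a1, a2, a3) tuple.
    GENERATE FLATTEN(kept);
};
DUMP B;
```

With this rule, the row 1,2,3 is preferred over 1,3,4 for a1 = 1. Without the ORDER step, which row LIMIT keeps is not guaranteed, so the explicit ordering is what makes the result repeatable.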

