Dear All,

This is the description of distinct from the wiki:

grunt> A = load 'mydata' using PigStorage() as (a, b, c);
grunt> B = group A by a;
grunt> C = foreach B {
    D = distinct A.b;
    generate flatten(group), COUNT(D);
}

But what if field b has subfields, for example:

A = load 'mydata' using PigStorage() as (a, b(b1,b2,b3), c);

How can I take the distinct of b.b1, since Pig does not allow D = distinct A.b.b1?

Thank you!
Thejas Nair 2012-03-09, 03:00
On 3/5/12 7:19 PM, guoyun wrote:
> How can I take the distinct of b.b1, since Pig does not allow
> D = distinct A.b.b1?

You need to use another nested foreach statement:

C = foreach B {
    B1BAG = foreach A generate b.b1;
    D = distinct B1BAG;
    generate flatten(group), COUNT(D);
}
-Thejas
> You need to use another nested foreach statement:
>
> C = foreach B { B1BAG = foreach A generate b.b1; D = distinct B1BAG;
> generate flatten(group), COUNT(D);}
>
> -Thejas

Thanks, but is this not supported in Pig 0.8.0?
Thejas Nair 2012-03-15, 22:46
On 3/13/12 9:02 PM, guoyun wrote:
> Thanks, but is this not supported in Pig 0.8.0?
It should work in 0.8. Do you get an error?
Thanks, Thejas
Dmitriy Ryaboy 2012-03-16, 02:02
Thejas, I don't think nested foreaches are in 0.8. They are only in trunk, IIRC.
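If a foreach inside a foreach block is indeed unavailable in 0.8, one possible workaround is to project b.b1 into a top-level field before grouping, so that the nested block only needs the distinct form already shown in the wiki example. The sketch below is untested against 0.8; the alias A1 is made up for illustration, and it assumes b is declared as a tuple using the b:tuple(...) schema syntax.

```pig
-- Assumption: b is a tuple with fields b1, b2, b3.
A = load 'mydata' using PigStorage() as (a, b:tuple(b1, b2, b3), c);

-- Pull b.b1 up to a top-level field before grouping (A1 is a hypothetical alias).
A1 = foreach A generate a, b.b1 as b1;

B = group A1 by a;

-- Same shape as the wiki example: distinct on a projection of the grouped bag.
C = foreach B {
    D = distinct A1.b1;
    generate flatten(group), COUNT(D);
}
```

This keeps only a and b1 after the projection, so any other fields needed later would have to be carried along in A1 as well.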