-
Removing unwanted items in tuple
James Newhaven 2012-05-09, 17:43
I have a bag of tuples like this:
{ (product, unwanted, count), (product, unwanted, count) }
Is it possible in Pig to generate a new bag with a revised tuple structure with one of its columns removed?
The desired structure I want is:
{ (product, count), (product, count) }
Thanks, James
-
RE: Removing unwanted items in tuple
Steve Bernstein 2012-05-09, 18:47
FLATTEN() the bag, re-project (foreach/generate) leaving out the unwanted items, then group back together if you like.
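A minimal sketch of those three steps, assuming the bag sits next to an id field. The input path, the relation names, and the field types are all made up for illustration, and the count column is called cnt here only to keep it visually distinct from the COUNT built-in:

```pig
-- Hypothetical input: path, schema, and field types are assumptions.
-- "cnt" stands in for the "count" column from the question.
A = LOAD 'input' AS (id:chararray,
                     bagname:bag{t:tuple(product:chararray, unwanted:chararray, cnt:long)});

-- FLATTEN un-nests the bag: one output row per tuple, with id repeated on each row.
B = FOREACH A GENERATE id, FLATTEN(bagname) AS (product, unwanted, cnt);

-- Re-project, leaving out the unwanted column.
C = FOREACH B GENERATE id, product, cnt;

-- Optionally group back together; each group holds a bag of (id, product, cnt) tuples.
D = GROUP C BY id;
```

Note that the regrouped bag still carries id inside every tuple, so a further projection would be needed if the bag should contain only (product, cnt).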
_____________ Steve Bernstein VP, Analytics Rearden Commerce, Inc.
+1.408.499.0961 Mobile
deem.com | reardencommerce.com
-
Re: Removing unwanted items in tuple
James Newhaven 2012-05-09, 20:22
Thanks. But what if my bag is grouped like this:
group: (id), bagname: { (product, unwanted, count), (product, unwanted, count) }
I want to retain the group tuple but remove the "unwanted" column from the bag.
I don't think I can perform a flatten here without losing the group tuple?
Thanks, James
-
Re: Removing unwanted items in tuple
Gianmarco De Francisci Mo... 2012-05-09, 21:09
You can use a nested foreach to project only the wanted columns.
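As a rough sketch, assuming the grouped relation came from a GROUP on id (the relation names raw, grouped, and trimmed, the input path, and the field types are hypothetical, and cnt stands in for the count column):

```pig
-- Hypothetical input: path, schema, and field types are assumptions.
raw = LOAD 'input' AS (id:chararray, product:chararray, unwanted:chararray, cnt:long);

-- Produces (group, raw: {(id, product, unwanted, cnt)}).
grouped = GROUP raw BY id;

trimmed = FOREACH grouped {
    -- Bag projection keeps only the listed fields of every tuple in the bag.
    wanted = raw.(product, cnt);
    -- The group key is retained next to the slimmed-down bag.
    GENERATE group, wanted AS bagname;
};
```

The group field passes through untouched, so nothing has to be flattened and regrouped. Recent Pig releases (0.10 onward, if I remember correctly) should also accept a FOREACH ... GENERATE statement inside the nested block in place of the bag projection.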
Cheers, -- Gianmarco