Pig >> mail # user >> Removing unwanted items in tuple

James Newhaven 2012-05-09, 17:43
Steve Bernstein 2012-05-09, 18:47
James Newhaven 2012-05-09, 20:22

Re: Removing unwanted items in tuple
You can use a nested FOREACH to project only the wanted columns from the bag; the group tuple is left as it is.
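
A minimal sketch of what that projection could look like. The file name 'grouped.txt', the aliases A, B and kept, the field types, and the use of a plain `id` field in place of the `group` tuple are all assumptions made for illustration; only the bag layout (product, unwanted, count) comes from the thread.

```pig
-- Hypothetical input: one row per id, with the bag already attached.
A = LOAD 'grouped.txt' AS (
        id:chararray,
        bagname:bag{t:tuple(product:chararray, unwanted:chararray, count:long)}
    );

-- Nested FOREACH: project the bag down to the wanted columns,
-- then emit the key together with the slimmed bag.
B = FOREACH A {
        kept = bagname.(product, count);
        GENERATE id, kept AS bagname;
    };

DESCRIBE B;
```

For a projection this simple, the nested block should be optional: `FOREACH A GENERATE id, bagname.(product, count);` expresses the same thing in one statement. The nested form becomes more useful once the bag also needs a FILTER, ORDER or DISTINCT before it is emitted.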

Cheers,
--
Gianmarco
On Wed, May 9, 2012 at 10:22 PM, James Newhaven <[EMAIL PROTECTED]> wrote:

> Thanks. But what if my bag is grouped like this:
>
> group: (id), bagname: { (product, unwanted, count), (product, unwanted, count) }
>
> I want to retain the group tuple but remove the "unwanted" column from the
> bag.
>
> I don't think I can perform a flatten here without losing the group tuple?
>
> Thanks,
> James
>
> On Wed, May 9, 2012 at 7:47 PM, Steve Bernstein <[EMAIL PROTECTED]> wrote:
>
> > FLATTEN() the bag, re-project (foreach/generate) leaving out the unwanted
> > items, then group back together if you like.
> >
> > _____________
> > Steve Bernstein
> > VP, Analytics
> > Rearden Commerce, Inc.
> >
> > +1.408.499.0961 Mobile
> >
> > deem.com | reardencommerce.com
> >
> >
> > -----Original Message-----
> > From: James Newhaven [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, May 09, 2012 10:43 AM
> > To: [EMAIL PROTECTED]
> > Subject: Removing unwanted items in tuple
> >
> > I have a bag of tuples like this:
> >
> > { (product, unwanted, count), (product, unwanted, count) }
> >
> > Is it possible in Pig to generate a new bag with a revised tuple structure
> > with one of its columns removed?
> >
> > The structure I want is:
> >
> > { (product, count), (product, count) }
> >
> > Thanks,
> > James
> >
>
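
For comparison, the FLATTEN-based route suggested in the first reply can be sketched as follows. It also shows that the key does not have to be lost: generating it alongside FLATTEN repeats it on every flattened row. As before, the file name, the aliases, the field types, and the plain `id` key are assumptions for illustration, not something stated in the thread.

```pig
-- Hypothetical input: one row per id, with the bag already attached.
A = LOAD 'grouped.txt' AS (
        id:chararray,
        bagname:bag{t:tuple(product:chararray, unwanted:chararray, count:long)}
    );

-- Un-nest the bag; the key is carried along on each resulting row.
flat = FOREACH A GENERATE id, FLATTEN(bagname) AS (product, unwanted, count);

-- Re-project, leaving out the unwanted column.
slim = FOREACH flat GENERATE id, product, count;

-- Group back together by the key.
regrouped = GROUP slim BY id;

DESCRIBE regrouped;
```

Two things to keep in mind with this route: the tuples inside the regrouped bag still contain `id`, so one more projection is needed to get exactly (product, count), and the GROUP adds a reduce step that the in-place bag projection avoids.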