Pig >> mail # user >> Removing unwanted items in tuple


Re: Removing unwanted items in tuple
You can use a nested foreach to project only the wanted columns.
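
A minimal sketch of what that could look like. The file name, relation names and field types are made up for illustration, and the inner FOREACH assumes a Pig release whose nested block accepts it (0.10 or later, as far as I know):

```pig
-- Hypothetical input: one row per (id, product, unwanted, count).
bagname = LOAD 'input.tsv'
          AS (id:chararray, product:chararray, unwanted:chararray, count:long);

-- Grouping yields: group, bagname: {(id, product, unwanted, count)}
grouped = GROUP bagname BY id;

-- Nested foreach: keep the group key, project only the wanted
-- columns out of each tuple in the bag.
trimmed = FOREACH grouped {
    wanted = FOREACH bagname GENERATE product, count;
    GENERATE group, wanted;
};

-- Resulting rows look like: (id, {(product, count), (product, count)})
DUMP trimmed;
```

If the nested FOREACH is not available, a plain bag projection should give the same shape: `FOREACH grouped GENERATE group, bagname.(product, count);`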

Cheers,
--
Gianmarco
On Wed, May 9, 2012 at 10:22 PM, James Newhaven <[EMAIL PROTECTED]> wrote:

> Thanks. But what if my bag is grouped like this:
>
> group: (id), bagname: { (product, unwanted, count), (product, unwanted, count) }
>
> I want to retain the group tuple but remove the "unwanted" column from the
> bag.
>
> I don't think I can perform a flatten here without losing the group tuple?
>
> Thanks,
> James
>
> On Wed, May 9, 2012 at 7:47 PM, Steve Bernstein <[EMAIL PROTECTED]> wrote:
>
> > FLATTEN() the bag, re-project (foreach/generate) leaving out the unwanted
> > items, then group back together if you like.
> >
> > _____________
> > Steve Bernstein
> > VP, Analytics
> > Rearden Commerce, Inc.
> >
> > +1.408.499.0961 Mobile
> >
> > deem.com | reardencommerce.com
> >
> >
> > -----Original Message-----
> > From: James Newhaven [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, May 09, 2012 10:43 AM
> > To: [EMAIL PROTECTED]
> > Subject: Removing unwanted items in tuple
> >
> > I have a bag of tuples like this:
> >
> > { (product, unwanted, count), (product, unwanted, count) }
> >
> > Is it possible in Pig to generate a new bag with a revised tuple structure
> > with one of its columns removed?
> >
> > The desired structure I want is:
> >
> > { (product, count), (product, count) }
> >
> > Thanks,
> > James
> >
>
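
For comparison, the FLATTEN route quoted above could be sketched roughly as follows. The file name, relation names and the `gid` alias are hypothetical. Generating the group key next to FLATTEN carries it onto every un-nested row, so it should not be lost:

```pig
-- Hypothetical input: one row per (id, product, unwanted, count).
bagname = LOAD 'input.tsv'
          AS (id:chararray, product:chararray, unwanted:chararray, count:long);
grouped = GROUP bagname BY id;

-- FLATTEN un-nests the bag; the group key is repeated on each row.
flat = FOREACH grouped GENERATE group AS gid, FLATTEN(bagname);

-- Re-project, leaving out the unwanted column.
kept = FOREACH flat GENERATE gid, product, count;

-- Group back together. Each tuple in the new bag also carries gid:
-- (gid, {(gid, product, count), ...})
regrouped = GROUP kept BY gid;
```

The second GROUP most likely costs an extra shuffle, so projecting inside the bag is usually the lighter option when the data is already grouped.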