Re: Ordering and limiting Tuples inside a Bag
Thanks Steve,

Yes, I did discover nested foreach, but I can't get the syntax right. Can
anyone help me get started on how it's meant to look?

Regards,
James

On Wed, May 9, 2012 at 4:55 PM, Steve Bernstein <[EMAIL PROTECTED]> wrote:

> You can. Check out nested foreach: order by, then limit (see, for
> example,
> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html).
>
> _____________
> Steve Bernstein
> VP, Analytics
> Rearden Commerce, Inc.
>
> +1.408.499.0961 Mobile
>
> deem.com | reardencommerce.com
>
> -----Original Message-----
> From: James Newhaven [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 09, 2012 4:57 AM
> To: [EMAIL PROTECTED]
> Subject: Ordering and limiting Tuples inside a Bag
>
> Hi,
>
> Another newbie Pig question.
>
> I have a relation with a structure like this: (city, { (productId,
> count), (productId, count) }).
>
> This relation tracks counts of products for each city. Each tuple
> contains a city name and a bag of products, each with an inventory
> count.
>
> Is it possible in Pig to extract only the top 3 products with the highest
> counts for each city, ordered from highest to lowest?
>
> Ideally, I would like the output to be like this:
>
> (New York City, ((apples, 50), (oranges, 34), (pears, 23)))
> (Another City, ((oranges, 52), (pears, 32), (apples, 12)))
>
> Thanks,
> James
>
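
For reference, the nested foreach that Steve suggests (order by, then limit, inside the foreach block) might look roughly like the sketch below. The file name 'inventory', the alias names, and the field types are assumptions, since the thread does not give a schema. The count field is called cnt here only to avoid any confusion with the built-in COUNT function.

```pig
-- Hypothetical input path and schema; adjust to match the real relation.
inventory = LOAD 'inventory'
    AS (city:chararray, products:bag{t:(productId:chararray, cnt:long)});

top3 = FOREACH inventory {
    -- Sort the tuples inside this city's bag, highest count first.
    sorted = ORDER products BY cnt DESC;
    -- Keep only the first three tuples of the sorted bag.
    top = LIMIT sorted 3;
    GENERATE city, top;
};

DUMP top3;
```

The second field of each output tuple is a bag, so Pig prints it in curly braces, e.g. (New York City,{(apples,50),(oranges,34),(pears,23)}), rather than in the nested parentheses shown in the desired output above.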