Re: Ordering and limiting Tuples inside a Bag
Ok, figured out the nested foreach. Thanks for your help.

Regards,
James
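
The thread does not include the final script. As a minimal sketch of the nested foreach approach suggested below (ORDER the inner bag, then LIMIT it), something like the following may work. The input path 'inventory', the relation names, and the field names (products, productId, cnt) are assumptions made for illustration and do not come from the original messages.

```pig
-- Hypothetical input: one row per city, each carrying a bag of
-- (productId, cnt) tuples. Path and names are illustrative assumptions.
inventory = LOAD 'inventory'
    AS (city:chararray,
        products:bag{t:tuple(productId:chararray, cnt:long)});

-- Nested foreach: the block runs once per city row and operates on that
-- row's bag.
top3 = FOREACH inventory {
    sorted = ORDER products BY cnt DESC;  -- highest counts first
    top    = LIMIT sorted 3;              -- keep at most three tuples
    GENERATE city, top;
};

DUMP top3;
```

Note that the second field of each result is a bag, so DUMP prints it with braces rather than as the nested tuple shown in the original question. If the per-city bag comes from a GROUP, the same nested block can be applied to the grouped relation instead.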

On Wed, May 9, 2012 at 5:33 PM, James Newhaven <[EMAIL PROTECTED]> wrote:

> Thanks Steve,
>
> Yes I did discover nested foreach, but I can't get the syntax right. Can
> anyone help get me started on how it's meant to look?
>
> Regards,
> James
>
>
> On Wed, May 9, 2012 at 4:55 PM, Steve Bernstein <[EMAIL PROTECTED]> wrote:
>
>> You can. Check out nested FOREACH: ORDER BY, then LIMIT (see, for
>> example,
>> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html).
>>
>> _____________
>> Steve Bernstein
>> VP, Analytics
>> Rearden Commerce, Inc.
>>
>> +1.408.499.0961 Mobile
>>
>> deem.com | reardencommerce.com
>>
>> -----Original Message-----
>> From: James Newhaven [mailto:[EMAIL PROTECTED]]
>> Sent: Wednesday, May 09, 2012 4:57 AM
>> To: [EMAIL PROTECTED]
>> Subject: Ordering and limiting Tuples inside a Bag
>>
>> Hi,
>>
>> Another newbie Pig question.
>>
>> I have a relation with a structure like this: (city, { (productId,
>> count), (productId, count) }).
>>
>> This relation tracks counts of products for each city: each tuple
>> contains a city name and a bag of products, each with an inventory
>> count.
>>
>> Is it possible in Pig to extract only the top 3 products with the
>> highest counts for each city, ordered from highest to lowest?
>>
>> Ideally, I would like the output to be like this:
>>
>> (New York City, ((apples, 50), (oranges, 34), (pears, 23)))
>> (Another City, ((oranges, 52), (pears, 32), (apples, 12)))
>>
>> Thanks,
>> James
>>
>
>