Pig >> mail # user >> Ordering and limiting Tuples inside a Bag


Re: Ordering and limiting Tuples inside a Bag
Ok, figured out the nested foreach. Thanks for your help.

Regards,
James

On Wed, May 9, 2012 at 5:33 PM, James Newhaven <[EMAIL PROTECTED]> wrote:

> Thanks Steve,
>
> Yes I did discover nested foreach, but I can't get the syntax right. Can
> anyone help get me started on how it's meant to look?
>
> Regards,
> James
>
>
> On Wed, May 9, 2012 at 4:55 PM, Steve Bernstein <[EMAIL PROTECTED]> wrote:
>
>> You can. Check out nested foreach: ORDER BY, then LIMIT (see, for
>> example,
>> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html).
>>
>> _____________
>> Steve Bernstein
>> VP, Analytics
>> Rearden Commerce, Inc.
>>
>> +1.408.499.0961 Mobile
>>
>> deem.com | reardencommerce.com
>>
>> -----Original Message-----
>> From: James Newhaven [mailto:[EMAIL PROTECTED]]
>> Sent: Wednesday, May 09, 2012 4:57 AM
>> To: [EMAIL PROTECTED]
>> Subject: Ordering and limiting Tuples inside a Bag
>>
>> Hi,
>>
>> Another newbie Pig question.
>>
>> I have a relation with a structure like this: (city, { (productId,
>> count), (productId, count) }).
>>
>> This relation tracks product counts for each city: each tuple contains
>> a city name and a bag of products, each with an inventory count.
>>
>> Is it possible in Pig to extract only the top 3 products with the
>> highest counts for each city, ordered from highest to lowest?
>>
>> Ideally, I would like the output to be like this:
>>
>> (New York City, ((apples, 50), (oranges, 34), (pears, 23)))
>> (Another City, ((oranges, 52), (pears, 32), (apples, 12)))
>>
>> Thanks,
>> James
>>
>
>
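
The thread ends without showing the working script. A minimal sketch of the nested foreach Steve describes might look like the following. The input path 'inventory.txt' and the names inventory, products, cnt, sorted, top3, and result are assumptions for illustration, not taken from James's actual script, and the count field is called cnt here only to keep it visually distinct from the COUNT builtin.

```pig
-- Assumed input: one tab-separated row per city, where the second column is a
-- bag of (productId, cnt) tuples, e.g.
--   New York City<TAB>{(apples,50),(oranges,34),(pears,23),(plums,7)}
-- The path and all relation/field names below are hypothetical.
inventory = LOAD 'inventory.txt'
    AS (city:chararray,
        products:bag{t:tuple(productId:chararray, cnt:long)});

-- Nested foreach: the block in braces runs once per input tuple, so the
-- ORDER and LIMIT apply to each city's bag rather than to the whole relation.
result = FOREACH inventory {
    sorted = ORDER products BY cnt DESC;  -- highest counts first
    top3   = LIMIT sorted 3;              -- keep at most three products
    GENERATE city, top3;
};

DUMP result;
```

Each output row should hold the city followed by a bag of at most three (productId, cnt) tuples, highest count first. Pig prints bags with braces, so a row would look roughly like (New York City,{(apples,50),(oranges,34),(pears,23)}) rather than the nested parentheses in the original question.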