Pig >> mail # user >> Ordering and limiting Tuples inside a Bag


RE: Ordering and limiting Tuples inside a Bag
You can. Use a nested FOREACH: inside it, ORDER BY the count and then LIMIT the result (see, for example, http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html).
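
A minimal sketch of that approach, under stated assumptions: the input file name, the relation names, and the field names below are hypothetical, and the count field is called cnt here only to avoid confusion with the built-in COUNT function. It assumes the data starts as flat (city, productId, cnt) records; if the relation already holds a bag per city, the nested block can be applied to that bag directly.

```pig
-- Hypothetical input: tab-separated (city, productId, cnt) records.
inventory = LOAD 'inventory.tsv' AS (city:chararray, productId:chararray, cnt:int);

-- Produces (group, {(city, productId, cnt), ...}), one tuple per city.
by_city = GROUP inventory BY city;

top3 = FOREACH by_city {
    -- Sort this city's bag by count, highest first.
    sorted = ORDER inventory BY cnt DESC;
    -- Keep only the first three tuples of the sorted bag.
    top = LIMIT sorted 3;
    -- Project away the repeated city field inside the bag.
    GENERATE group AS city, top.(productId, cnt) AS products;
};

DUMP top3;
```

The ORDER and LIMIT run once per city because they sit inside the nested FOREACH block, so each city keeps its own top three rather than a global top three.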

_____________
Steve Bernstein
VP, Analytics
Rearden Commerce, Inc.

+1.408.499.0961 Mobile

deem.com | reardencommerce.com

-----Original Message-----
From: James Newhaven [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 09, 2012 4:57 AM
To: [EMAIL PROTECTED]
Subject: Ordering and limiting Tuples inside a Bag

Hi,

Another newbie Pig question.

I have a relation with a structure like this: (city, { (productId, count), (productId, count) }).

This relation tracks product counts for each city: each tuple contains a city name and a bag of products, each with an inventory count.

Is it possible in Pig to extract only the top 3 products with the highest counts for each city, ordered from highest to lowest?

Ideally, I would like the output to be like this:

(New York City, ((apples, 50), (oranges, 34), (pears, 23)))
(Another City, ((oranges, 52), (pears, 32), (apples, 12)))

Thanks,
James