RE: Ordering and limiting Tuples inside a Bag
You can. Use a nested FOREACH: inside the block, ORDER the bag by count in descending order, then LIMIT it to 3 (see, for example, http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html).
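
Something along these lines should work. This is only a rough sketch: the relation name, file name and field names (cities, 'city_products', products, productId, cnt) are made up for illustration, so adjust them to your actual schema. I used cnt rather than count to avoid any confusion with the COUNT built-in.

```pig
-- Hypothetical input: each row is (city, bag of (productId, cnt)).
-- The path and all field names here are assumptions, not from the original data.
cities = LOAD 'city_products'
         AS (city:chararray, products:bag{t:(productId:chararray, cnt:long)});

top3 = FOREACH cities {
    -- Sort the tuples inside this city's bag, highest count first.
    sorted = ORDER products BY cnt DESC;
    -- Keep only the first three tuples of the sorted bag.
    best = LIMIT sorted 3;
    GENERATE city, best;
};

DUMP top3;
```

Strictly speaking, bags in Pig are unordered, but in practice the bag produced by ORDER followed by LIMIT inside the nested block comes out in sorted order. A city with fewer than three products simply keeps all of them.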

Steve Bernstein
VP, Analytics
Rearden Commerce, Inc.

+1.408.499.0961 Mobile

deem.com | reardencommerce.com

-----Original Message-----
From: James Newhaven [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 09, 2012 4:57 AM
Subject: Ordering and limiting Tuples inside a Bag


Another newbie Pig question.

I have a relation with a structure like this: (city, { (productId, count), (productId, count) }).

This relation tracks product counts for each city: each tuple contains a city name and a bag of products, each with an inventory count.

Is it possible in Pig to extract only the top 3 products with the highest counts for each city, ordered from highest to lowest?

Ideally, I would like the output to be like this:

(New York City, ((apples, 50), (oranges, 34), (pears, 23)))
(Another City, ((oranges, 52), (pears, 32), (apples, 12)))