Pig >> mail # user >> Group by Fetching top 100 from each group

Re: Group by Fetching top 100 from each group

LIMIT and ORDER BY are both allowed nested ops for a FOREACH statement.
These should be able to do what you want.


B = GROUP A BY key;
C = FOREACH B {
    X = ORDER A BY orderingParam;
    Y = LIMIT X 100;
    GENERATE group, Y;
}
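
For reference, here is a slightly fuller sketch of the same nested-FOREACH pattern as a complete script. The input and output paths, the schema, the payload field, and the alias names are all assumptions made up for illustration. DESC is added on the assumption that "top" means the largest values of orderingParam, since ORDER BY sorts ascending by default.

```pig
-- Hypothetical input: tab-separated records of (key, orderingParam, payload).
A = LOAD 'input.tsv' USING PigStorage('\t')
    AS (key:chararray, orderingParam:long, payload:chararray);

-- One bag of records per distinct key.
B = GROUP A BY key;

-- Inside the nested block, A refers to the bag of records for the current group.
C = FOREACH B {
    sorted = ORDER A BY orderingParam DESC;  -- largest values first (assumed meaning of "top")
    top100 = LIMIT sorted 100;               -- keep at most 100 records per group
    GENERATE FLATTEN(top100);                -- emit one output row per kept record
};

STORE C INTO 'top100_per_key' USING PigStorage('\t');
```

FLATTEN turns each group's bag back into plain rows. If you would rather keep one row per key with the records in a bag, use GENERATE group, top100; instead.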


On Fri, Jun 29, 2012 at 04:19:18PM -0700, Benjamin Juhn wrote:
> Hi there,
> I'm trying to write a group by statement, only returning the top 100 records from each group.  Does pig support this?
> Thanks,
> Ben

Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3