Pig >> mail # user >> Please help with grouped count

Please help with grouped count
We have logs in the following format:

us, foo
us, foo
fr, fizz
us, bar
fr, baz
fr, fizz
us, foo
fr, fizz

The first column is a country and the second column is a search term.

How in the world can I output each country followed by its search terms,
ordered by how often they occur (most frequent first)? For example:

us, (foo, bar)      # Top term for 'us' is foo then bar then ...
fr, (fizz, baz)      # Top term for 'fr' is fizz then baz then ...
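
To pin down exactly what output is being asked for, here is a minimal Python sketch of the logic (not a Pig script): count each (country, term) pair, then list each country's terms from most to least frequent. The function name `top_terms_by_country` and the `sample` list are made up for illustration, and it assumes every non-blank line has the "country, term" shape shown above. The same two-level idea (group by the pair and count, then group by country and sort by count descending) is presumably what a Pig solution built from GROUP, COUNT and ORDER would need to express.

```python
from collections import Counter, defaultdict


def top_terms_by_country(lines):
    """Return {country: [terms ordered by descending frequency]}."""
    # One Counter of search terms per country.
    counts = defaultdict(Counter)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Assumes "country, term"; split on the first comma only.
        country, term = (field.strip() for field in line.split(",", 1))
        counts[country][term] += 1
    # most_common() sorts by count, highest first; ties keep first-seen order.
    return {
        country: [term for term, _ in terms.most_common()]
        for country, terms in counts.items()
    }


# Hypothetical input: the sample log lines from the post.
sample = [
    "us, foo",
    "us, foo",
    "fr, fizz",
    "us, bar",
    "fr, baz",
    "fr, fizz",
    "us, foo",
    "fr, fizz",
]

result = top_terms_by_country(sample)
for country, terms in result.items():
    print(f"{country}, ({', '.join(terms)})")
```

On the sample data this gives foo before bar for 'us' and fizz before baz for 'fr', matching the desired output above.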