
GROUP ALL Partitioning
Hi there,

Can anyone provide a quick explanation of, or a link to the source code
for, how Pig partitions data on a GROUP alias ALL operation?  We're seeing
some odd behaviour, likely caused by skew in our data, and I'm curious how
Pig assigns groups to reducers when there's no group key.

We've gotten around this already by providing our own partition key to
reduce skew.
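
To illustrate the kind of workaround meant here, below is a rough Python
simulation of the idea, not our actual Pig job. It assumes a simple count
as the aggregate; the names salted_count and NUM_SALTS, and the round-robin
key assignment, are made up for the example. Each record gets a synthetic
key so partial results are spread over several groups, and the partials are
combined in a second step.

```python
from collections import defaultdict

NUM_SALTS = 4  # hypothetical number of synthetic partition keys


def salted_count(records, num_salts=NUM_SALTS):
    """Count records in two stages instead of one single global group."""
    # Stage 1: give each record a synthetic key (round-robin here) and
    # count per key; each key could be handled by a different reducer.
    partial = defaultdict(int)
    for i, _record in enumerate(records):
        partial[i % num_salts] += 1
    # Stage 2: combine the small set of partial counts into one total.
    return dict(partial), sum(partial.values())


partials, total = salted_count(range(10))
print(partials, total)
```

This only works directly for aggregates that can be combined from partial
results (counts, sums, min/max); others would need a different second step.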


Mike Sukmanowsky

Product Lead, http://parse.ly
989 Avenue of the Americas, 3rd Floor
New York, NY  10018
p: +1 (416) 953-4248