Pig, mail # user - GROUP ALL Partitioning
Mike Sukmanowsky 2014-01-23, 19:38
Hi there,

Can anyone provide a quick explanation of, or a link to the source code for,
how Pig partitions data on a GROUP alias ALL operation? We're seeing some odd
behaviour, likely caused by skew in our data, and I'd like to understand how
Pig assigns groups to reducers when there's no group key.

We've gotten around this already by providing our own partition key to
reduce skew.
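
As a rough illustration of that kind of workaround (a sketch only, not
necessarily the script used here; the input path, the relation and field
names, and the salt count of 20 are all hypothetical), a salted two-stage
aggregation might look like this:

```pig
-- Hypothetical input and schema, for illustration only.
events = LOAD 'events' AS (url:chararray, hits:long);

-- Plain GROUP ALL: every tuple lands in one group keyed 'all'.
all_grp = GROUP events ALL;
total   = FOREACH all_grp GENERATE SUM(events.hits) AS total_hits;

-- Salted variant: attach an artificial key (0..19) to each tuple so the
-- first-stage work is spread over several reducers.
salted  = FOREACH events GENERATE hits, (int)(RANDOM() * 20) AS salt;
by_salt = GROUP salted BY salt PARALLEL 20;
partial = FOREACH by_salt GENERATE SUM(salted.hits) AS subtotal;

-- Second stage: combine the small set of partial results with GROUP ALL.
final_grp    = GROUP partial ALL;
salted_total = FOREACH final_grp GENERATE SUM(partial.subtotal) AS total_hits;
```

The second GROUP ALL still produces a single group, but it only has to
handle one row per salt value rather than the full data set.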

Mike

--
Mike Sukmanowsky

Product Lead, http://parse.ly
989 Avenue of the Americas, 3rd Floor
New York, NY  10018
p: +1 (416) 953-4248
e: [EMAIL PROTECTED]