Pig >> mail # dev >> Hash aggregation experience


Hash aggregation experience
Hi all,

Has anyone tried the hash aggregation feature in Pig 0.10 and seen any
performance improvement? I have recently been benchmarking HashAgg against
the combiner to see whether we should use HashAgg more aggressively, given
that it has lower overhead than the combiner and is more flexible: it can
auto-disable itself, while the combiner can't.

Some of my benchmark results can be found in
https://cwiki.apache.org/confluence/display/PIG/Pig+Performance+Optimization#PigPerformanceOptimization-HashAggvs.Combiner.
Any comments are appreciated!

Jie