Map-only aggregation
Jie Li 2013-01-05, 03:46
Hi all,
Can Hive implement an aggregation as a map-only job? As we know, the data may already be pre-partitioned via PARTITIONED BY or CLUSTERED BY, so we don't need the reduce phase to repartition it. The Bucket Join seems to take advantage of the buckets for joins, so I wonder whether there is a similar optimization for aggregations.

Thanks,
Jie
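
The idea in the question can be sketched outside of Hive. The following is a minimal Python illustration, not Hive code: the function names (`bucket_rows`, `map_only_sum`, `aggregate_without_reduce`) are hypothetical, and hashing on the grouping key is assumed to stand in for CLUSTERED BY on that key. It shows why no shuffle is needed when every row for a given key already sits in one bucket: each "mapper" can produce final sums on its own, and the per-bucket results are simply concatenated.

```python
from collections import defaultdict


def bucket_rows(rows, num_buckets):
    """Simulate data bucketed on the grouping key (hypothetical helper).

    All rows that share a key land in the same bucket.
    """
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[hash(key) % num_buckets].append((key, value))
    return buckets


def map_only_sum(bucket):
    """Aggregate one bucket in isolation, as a single map task would."""
    sums = defaultdict(int)
    for key, value in bucket:
        sums[key] += value
    return dict(sums)


def aggregate_without_reduce(buckets):
    """Combine per-bucket results with no repartitioning step."""
    result = {}
    for bucket in buckets:
        partial = map_only_sum(bucket)
        # Keys cannot overlap across buckets, so this is a plain union
        # rather than a merge of partial aggregates.
        assert not (result.keys() & partial.keys())
        result.update(partial)
    return result


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
totals = aggregate_without_reduce(bucket_rows(rows, num_buckets=3))
print(totals)
```

If the data were not bucketed on the grouping key, the same key could appear in several buckets and the per-bucket sums would only be partial results, which is the case where a reduce phase is required.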