MapReduce, mail # user - Difference between combiner and aggregator


Difference between combiner and aggregator
jamal sasha 2013-04-05, 20:30
Hi,
I am trying to understand the difference between a combiner and an aggregator.

Based on my reading, here are the two versions of the word count mapper:

aggregator:
class Mapper
  method MAP
    H <-- new associative array
    for all term t in document do
      H{t} = H{t} + 1
    for all term t in H do
      EMIT(term t, count H{t})
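
To make the first version concrete, here is a minimal Python sketch of it. This is not Hadoop API code: the class name `PerDocumentMapper`, the `emit` callback, and the whitespace tokenization are all assumptions made for illustration. The point it shows is that H is rebuilt on every call to MAP, so counts are only merged within a single document.

```python
from collections import Counter


class PerDocumentMapper:
    """Word count mapper that aggregates within one map() call only."""

    def __init__(self, emit):
        # `emit` is a hypothetical callback standing in for EMIT.
        self.emit = emit

    def map(self, document):
        # H is created fresh for every document (assumed: whitespace tokens).
        counts = Counter(document.split())
        for term, count in counts.items():
            self.emit(term, count)


emitted = []
mapper = PerDocumentMapper(lambda term, count: emitted.append((term, count)))
for doc in ["a b a", "b c"]:
    mapper.map(doc)

print(emitted)  # [('a', 2), ('b', 1), ('b', 1), ('c', 1)]
```

Note that `b` is emitted twice, once per document, because nothing is carried over between calls to `map`.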
combiner:
class Mapper
  method INITIALIZE
    H <-- new associative array
  method MAP
    for all term t in document do
      H{t} = H{t} + 1
  method CLOSE
    for all term t in H do
      EMIT(term t, count H{t})
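
The second version can be sketched the same way. Again this is only an illustration under assumptions: `CrossDocumentMapper`, the `emit` callback, and the whitespace tokenization are hypothetical names and choices, not Hadoop's API (in Hadoop's Java `Mapper` class the corresponding hooks are `setup` and `cleanup`). Here H lives for the whole lifetime of the mapper, so counts are merged across every document it sees and emitted once at the end.

```python
from collections import Counter


class CrossDocumentMapper:
    """Word count mapper that keeps one table across all map() calls."""

    def __init__(self, emit):
        # `emit` is a hypothetical callback standing in for EMIT.
        self.emit = emit
        self.counts = None

    def initialize(self):
        # H is created once, before the first document.
        self.counts = Counter()

    def map(self, document):
        # Only update the table; nothing is emitted here.
        self.counts.update(document.split())

    def close(self):
        # Emit each term once, after the last document.
        for term, count in self.counts.items():
            self.emit(term, count)


emitted = []
mapper = CrossDocumentMapper(lambda term, count: emitted.append((term, count)))
mapper.initialize()
for doc in ["a b a", "b c"]:
    mapper.map(doc)
mapper.close()

print(emitted)  # [('a', 2), ('b', 2), ('c', 1)]
```

On this toy input the mapper emits three pairs and `b` appears once with its total. The price is that the table has to stay in memory for every distinct term the mapper sees until `close` runs.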

So the second method is how a combiner is implemented.
But the first one seems much simpler.
What do I gain by using a combiner instead of local aggregation?