Hadoop, mail # user - Combiners

Re: Combiners
Owen O'Malley 2011-10-31, 03:22
On Sat, Oct 29, 2011 at 3:52 AM, Mathias Herberts wrote:

> My question is, what happens if the combiner outputs different keys
> than what it is being fed? The output of the combiner will suffer two
> flaws:
> 1. It won't be sorted
> 2. It might end up in the wrong partition

Yes. We've talked about adding various checks, but I don't think anyone has
added them. We obviously have the input key and one option would be to
ignore the output key.
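
To make the second flaw concrete, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes String keys and uses the same formula as Hadoop's default HashPartitioner; the class name CombinerKeyDemo and the sample keys are hypothetical and only for illustration.

```java
public class CombinerKeyDemo {
    // Same formula as Hadoop's default HashPartitioner, applied here to
    // String keys (an assumption; real jobs usually use Writable keys).
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        String inputKey = "a";      // key the combiner was fed
        String rewrittenKey = "b";  // hypothetical key a misbehaving combiner emits

        // The map output was already partitioned using inputKey, so the
        // combiner's record stays in that partition even though its new key
        // hashes to a different one.
        System.out.println(partition(inputKey, numReducers));      // prints 1
        System.out.println(partition(rewrittenKey, numReducers));  // prints 2
    }
}
```

The record with the rewritten key would be delivered to the reducer for partition 1, while every other record with key "b" goes to partition 2.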

> Since a Combiner is simply a Reducer with no other constraints,

That isn't true. Combiners are required to be:
  1. Idempotent - The number of times the combiner is applied can't change
the output
  2. Transitive - The order of the inputs can't change the output
  3. Side-effect free - Combiners can't have side effects (or they won't be
idempotent)
  4. Preserve the sort order - They can't change the keys to disrupt the
sort order
  5. Preserve the partitioning - They can't change the keys to change the
partitioning

All 5 of them are required for combiners.
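
As an illustration of the first two requirements, here is a small sketch in plain Java that simulates a summing combiner for a single key, without using the Hadoop API. The class name SumCombinerDemo and the methods combine and reduce are hypothetical stand-ins, not Hadoop classes. The key is never touched, so the sort order and partitioning are trivially preserved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SumCombinerDemo {
    // Stand-in for a combiner on the values of one key: emits a single
    // partial sum and has no side effects.
    static List<Integer> combine(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return Collections.singletonList(sum);
    }

    // Stand-in for the final reducer on the same key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 2);

        int noCombiner = reduce(values);                   // combiner never runs
        int once = reduce(combine(values));                // combiner runs once
        int twice = reduce(combine(combine(values)));      // combiner runs twice
        int reordered = reduce(combine(Arrays.asList(2, 3, 1))); // inputs reordered

        // Combiner applied to only part of the values, as with a map-side spill.
        List<Integer> partial = new ArrayList<>(combine(values.subList(0, 2)));
        partial.add(values.get(2));
        int partlyCombined = reduce(partial);

        // prints "6 6 6 6 6"
        System.out.println(noCombiner + " " + once + " " + twice + " "
                + reordered + " " + partlyCombined);
    }
}
```

The reducer sees the same total no matter how many times the combiner ran or in what order it saw the values.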

-- Owen