MapReduce, mail # user - question about combiner

question about combiner
Han JU 2013-05-10, 15:19

For a MapReduce job with a lot of intermediate data between the mappers and
reducers, I implemented a combiner that uses a more compact representation
of the intermediate data, and I verified that the final result is still
correct when the combiner is used. But when I look at the job counters
"FILE_BYTES_WRITTEN" or "Reduce shuffle bytes", the values with the combiner
are twice as large as without it. As I understand it, these two counters
reflect the size of the mapper output. With a combiner, the mapper output
should shrink, but that is not the case here.

Does this mean that my combiner doesn't work and that it actually increases
the size of the mapper output?
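
To make the expectation concrete, here is a minimal sketch in plain Python
(not Hadoop code) of what a combiner is supposed to do to one mapper's
output. The word-count style data, the function names, and the
tab-separated text encoding are all assumptions made for illustration. The
sketch does not model Hadoop's actual spill, merge, or serialization
behavior, so it says nothing about what the job counters themselves report.

```python
from collections import defaultdict


def map_output(tokens):
    """Emit one (key, 1) pair per input token, like a word-count mapper."""
    return [(token, 1) for token in tokens]


def combine(pairs):
    """Sum the values per key locally, as a combiner would on one mapper's output."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def serialized_size(pairs):
    """Bytes needed to write the pairs as tab-separated text lines.

    This is only a stand-in for the real mapper output size (assumption).
    """
    return sum(len(f"{key}\t{value}\n".encode("utf-8")) for key, value in pairs)


# Hypothetical input: three distinct keys repeated many times.
tokens = ["a", "b", "a", "c", "a", "b"] * 1000

raw = map_output(tokens)
combined = combine(raw)

print("records without combiner:", len(raw))
print("records with combiner:   ", len(combined))
print("bytes without combiner:  ", serialized_size(raw))
print("bytes with combiner:     ", serialized_size(combined))
```

In this toy setting the combined output has one record per distinct key, so
both the record count and the byte size drop. That is the behavior I
expected to see reflected in the counters.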

*JU Han*

Software Engineer Intern @ KXEN Inc.
UTC   -  Université de Technologie de Compiègne
GI06 - Fouille de Données et Décisionnel

+33 0619608888