MapReduce user mailing list


When is the reduce function used as a combiner?
Hi guys,

When is the reduce function used as a combiner? Is it used as a combiner
when the iterable passed to the reduce function is large?

Is there a maximum size for that iterable? For example, if the iterable
holds more than 1000 values, will the reduce function be called more than
once for that key?

Another question: when the reduce function is used as a combiner, the
input key/value types and the output key/value types must be the same,
correct? If they differ, what happens? Is an exception thrown at runtime?
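
To make the type question concrete, here is a minimal plain-Java sketch. It
uses no Hadoop classes, and the names CombinerTypeSketch, sumReduce and
formatReduce are hypothetical, chosen only for illustration. It assumes a
sum-style reduce function and shows why matching input and output value
types matter if the function's output is to be fed back in as input. It
does not show what Hadoop itself does when the types differ.

```java
import java.util.Arrays;
import java.util.List;

class CombinerTypeSketch {
    // Hypothetical reduce function: Integer values in, Integer value out.
    static Integer sumReduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Hypothetical reduce function whose output type (String) differs
    // from its input value type (Integer).
    static String formatReduce(String key, List<Integer> values) {
        return key + "=" + sumReduce(key, values);
    }

    public static void main(String[] args) {
        // Partial results over two slices of the values for one key.
        Integer partA = sumReduce("k", Arrays.asList(1, 2, 3));
        Integer partB = sumReduce("k", Arrays.asList(4, 5));

        // Same types in and out: the partial results are valid input again.
        Integer total = sumReduce("k", Arrays.asList(partA, partB));
        System.out.println(total);

        // Different types: a String result cannot be passed back in as an
        // Integer value, so the line below would not compile.
        // formatReduce("k", Arrays.asList(formatReduce("k", Arrays.asList(1, 2))));
        System.out.println(formatReduce("k", Arrays.asList(partA, partB)));
    }
}
```

In this sketch the mismatch is caught by the Java compiler. Whether a real
job fails at submission or at runtime is exactly the open question above.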

Fourth question: let's say the iterable is very large, so Hadoop adds the
output of reduce to the iterable and passes it to reduce again together
with the values that have not been processed yet. When does Hadoop know
that, from that point on, the output of the reduce function should be
written to HDFS as the real output? When there are no more values to put
into that iterable?
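
The scenario in this last question can be written down as a small
plain-Java model. This is only a sketch of the behaviour described above
under stated assumptions, not Hadoop's actual implementation: the class
ChunkedReduceSketch, the method reduceInChunks and the chunkSize parameter
are all hypothetical, and the "final output" point is modelled as the
moment when no unprocessed values remain.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class ChunkedReduceSketch {
    // Hypothetical sum-style reduce over the values of a single key.
    static int sumReduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Model of the scenario in the question: reduce at most chunkSize
    // values at a time, put the partial result back with the unprocessed
    // values, and stop when only one value is left.
    static int reduceInChunks(List<Integer> values, int chunkSize) {
        if (chunkSize < 2) {
            throw new IllegalArgumentException("chunkSize must be at least 2");
        }
        Deque<Integer> pending = new ArrayDeque<>(values);
        while (pending.size() > 1) {
            List<Integer> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize && !pending.isEmpty()) {
                chunk.add(pending.poll());
            }
            // The partial output becomes input for the next call.
            pending.addFirst(sumReduce(chunk));
        }
        // Nothing left to merge: this value is the "real" output.
        return pending.isEmpty() ? 0 : pending.poll();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(reduceInChunks(values, 2));
        System.out.println(sumReduce(values));
    }
}
```

Both calls in main give the same total, which only works because the sum is
associative and its output type equals its input type.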