MapReduce >> mail # user >> When reduce tasks start in MapReduce Streaming?


When reduce tasks start in MapReduce Streaming?
Hi,

I read in the documentation that in MapReduce the reduce tasks only start
after a percentage (90% by default) of the maps have finished. This means
that the slowest maps can delay the start of the reduce tasks, and that the
reduce tasks receive their input as a batch. In other words, there is no
scenario in which the reduce tasks consume data as the map tasks produce
it. Is this also the case in Hadoop MapReduce Streaming?
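
To make the question concrete, here is a minimal sketch of the behaviour I am describing. It is a toy model only, not Hadoop code: the function name `reduce_start_time` and the finish times are hypothetical, and the 0.9 is just the figure quoted above. As far as I know, in Hadoop 1.x the real fraction is set by the `mapred.reduce.slowstart.completed.maps` property.

```python
import math


def reduce_start_time(map_finish_times, slowstart_fraction):
    """Earliest time at which reduce tasks become eligible to start.

    Toy model (an assumption, not Hadoop's scheduler): reducers may start
    once ceil(slowstart_fraction * number_of_maps) map tasks have finished.
    """
    if not map_finish_times:
        raise ValueError("need at least one map task")
    if not 0.0 <= slowstart_fraction <= 1.0:
        raise ValueError("slowstart_fraction must be between 0 and 1")

    finished = sorted(map_finish_times)
    needed = math.ceil(slowstart_fraction * len(finished))
    if needed == 0:
        # No completed maps are required, so reducers can start immediately.
        return 0
    return finished[needed - 1]


# Hypothetical finish times (seconds) for ten maps; the last one is a straggler.
finish_times = [10, 12, 11, 13, 12, 14, 11, 10, 12, 60]

print(reduce_start_time(finish_times, 0.9))  # waits for 9 of the 10 maps
print(reduce_start_time(finish_times, 1.0))  # waits for the straggler too
```

In this model a threshold of 0.9 lets the reducers start before the straggler finishes, while a threshold of 1.0 makes them wait for it.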

--
Best regards,
P
Jeff Bean 2013-01-16, 05:41
Pedro Sá da Costa 2013-01-16, 09:04
Jeff Bean 2013-01-16, 09:20