MapReduce >> mail # user >> Keeping Map-Tasks alive


Keeping Map-Tasks alive
Hi,
Is there a way to keep a map task alive after it has finished its work, so
that it can later perform another task on the same input?
For example, consider the k-means clustering algorithm (k-means
description <http://en.wikipedia.org/wiki/K-means_clustering> and Hadoop
implementation <http://codingwiththomas.blogspot.co.il/2011/05/k-means-clustering-with-mapreduce.html>).
The only thing that changes between iterations is the cluster centers. All
the input points remain the same. Keeping the mapper alive and running the
next round of map tasks on the same node would save a lot of communication
cost.
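
To make the access pattern concrete, here is a minimal in-memory sketch of the k-means loop, written in plain Python rather than against the Hadoop API. The function names (map_assign, reduce_update, kmeans) and the sample points are hypothetical and chosen only for illustration. The thing to notice is that `points` is passed unchanged into every iteration, while only `centers` is replaced.

```python
from collections import defaultdict
from math import dist


def map_assign(points, centers):
    """Map step: emit (index of nearest center, point) for each input point."""
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        yield nearest, p


def reduce_update(assignments, centers):
    """Reduce step: replace each center with the mean of its assigned points."""
    groups = defaultdict(list)
    for idx, p in assignments:
        groups[idx].append(p)
    new_centers = list(centers)  # a center with no points keeps its position
    for idx, pts in groups.items():
        new_centers[idx] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return new_centers


def kmeans(points, centers, max_iters=10):
    """Repeat map + reduce until the centers stop moving or max_iters is hit."""
    for _ in range(max_iters):
        # The same `points` are scanned every round; only `centers` differs.
        updated = reduce_update(map_assign(points, centers), centers)
        if updated == centers:
            break
        centers = updated
    return centers


# Hypothetical sample data: two obvious clusters on the x-axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centers = kmeans(points, [(1.0, 1.0), (9.0, 1.0)])
```

In a MapReduce setting each pass of this loop would be a separate job, so the unchanged points are read again by a fresh set of map tasks on every iteration, which is the cost the question is about.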

Thanks,
Yaron