
Yaron Gonen 2012-08-05, 10:47
Harsh J 2012-08-05, 16:49
Re: Keeping Map-Tasks alive
Thanks for the fast reply, but I don't see how a custom record reader will help.
Consider k-means again: the mappers need to stand by until all the
reducers finish calculating the new cluster centers. Only then, after the
reducers finish their work, do the stand-by mappers come back to life and
perform their work.
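
To make the dependency concrete, here is a minimal single-process sketch of the k-means rounds in plain Java. It does not use the Hadoop API; the class name `KMeansRounds`, the 1-D points, and the starting centers are all assumptions chosen for illustration. The assignment loop plays the role of the map phase and the averaging loop plays the role of the reduce phase, so the next round cannot start until the averaging has finished.

```java
import java.util.Arrays;

// Illustrative only: a single-process stand-in for the map/reduce rounds.
public class KMeansRounds {
    // "Map" step: index of the center nearest to a 1-D point.
    static int nearest(double p, double[] centers) {
        int best = 0;
        for (int i = 1; i < centers.length; i++) {
            if (Math.abs(p - centers[i]) < Math.abs(p - centers[best])) {
                best = i;
            }
        }
        return best;
    }

    // One round: assign every point (map), then average per cluster (reduce).
    static double[] round(double[] points, double[] centers) {
        double[] sum = new double[centers.length];
        int[] count = new int[centers.length];
        for (double p : points) {
            int c = nearest(p, centers);
            sum[c] += p;
            count[c]++;
        }
        double[] next = new double[centers.length];
        for (int i = 0; i < centers.length; i++) {
            // An empty cluster keeps its previous center.
            next[i] = count[i] == 0 ? centers[i] : sum[i] / count[i];
        }
        return next;
    }

    // The points never change; only the centers are replaced between rounds.
    static double[] run(double[] points, double[] centers, int maxRounds) {
        for (int r = 0; r < maxRounds; r++) {
            double[] next = round(points, centers);
            if (Arrays.equals(next, centers)) {
                break; // centers stopped moving
            }
            centers = next;
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        double[] centers = run(points, new double[] {0.0, 5.0}, 10);
        System.out.println(Arrays.toString(centers)); // prints [2.0, 11.0]
    }
}
```

In this sketch `points` stays in memory across rounds, which is the saving I am after; in a chain of separate MapReduce jobs the same input would be read again for every round.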

On Sun, Aug 5, 2012 at 7:49 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Sure you can, as we provide pluggable code points via the API. Just write
> a custom record reader that doubles the work (first round reads actual
> input, second round reads your known output and reiterates). In the mapper,
> separate the first and second logic via a flag.
> On Sun, Aug 5, 2012 at 4:17 PM, Yaron Gonen <[EMAIL PROTECTED]> wrote:
>> Hi,
>> Is there a way to keep a map task alive after it has finished its work,
>> to later perform another task on the same input?
>> For example, consider the k-means clustering algorithm (k-means
>> description <http://en.wikipedia.org/wiki/K-means_clustering> and hadoop
>> implementation <http://codingwiththomas.blogspot.co.il/2011/05/k-means-clustering-with-mapreduce.html>).
>> The only thing that changes between iterations is the cluster centers. All the
>> input points remain the same. Keeping the mapper alive and performing the
>> next round of map tasks on the same node would save a lot of communication
>> cost.
>> Thanks,
>> Yaron
> --
> Harsh J
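
The two-round reader with a mapper flag suggested in the quoted reply could look roughly like the sketch below. It is plain Java and deliberately avoids the Hadoop API: `TwoPassSketch`, `TwoPassReader`, and `Tagged` are hypothetical names, and the reader simply replays an in-memory list twice rather than reading real input splits or reducer output. It shows only the flag-based branching; it does not model waiting for the reducers between the two rounds.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Illustrative only: no Hadoop classes are used here.
public class TwoPassSketch {
    // A record tagged with the round in which the reader emitted it.
    static final class Tagged {
        final boolean secondPass;
        final String value;

        Tagged(boolean secondPass, String value) {
            this.secondPass = secondPass;
            this.value = value;
        }
    }

    // Hypothetical stand-in for a custom record reader: yields every
    // input record once for the first round, then once more for the second.
    static final class TwoPassReader implements Iterator<Tagged> {
        private final List<String> input;
        private int pos = 0;

        TwoPassReader(List<String> input) {
            this.input = input;
        }

        @Override
        public boolean hasNext() {
            return pos < 2 * input.size();
        }

        @Override
        public Tagged next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            boolean second = pos >= input.size();
            String value = input.get(pos % input.size());
            pos++;
            return new Tagged(second, value);
        }
    }

    // The "mapper" separates first-round and second-round logic via the flag.
    static String map(Tagged rec) {
        return (rec.secondPass ? "second:" : "first:") + rec.value;
    }

    // Drives the reader to exhaustion and collects the mapper output.
    static List<String> runAll(List<String> input) {
        List<String> out = new ArrayList<>();
        Iterator<Tagged> reader = new TwoPassReader(input);
        while (reader.hasNext()) {
            out.add(map(reader.next()));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [first:a, first:b, second:a, second:b]
        System.out.println(runAll(List.of("a", "b")));
    }
}
```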
Harsh J 2012-08-05, 22:21
Yaron Gonen 2012-08-06, 07:23