Yaron Gonen 2012-08-05, 10:47
Harsh J 2012-08-05, 16:49
Re: Keeping Map-Tasks alive
Yaron Gonen 2012-08-05, 18:41

Thanks for the fast reply, but I don't see how a custom record reader will
help. Consider k-means again: the mappers need to stand by until all the
reducers have finished calculating the new cluster centers. Only then, after
the reducers finish their work, do the stand-by mappers come back to life and
perform their work.

On Sun, Aug 5, 2012 at 7:49 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Sure you can, as we provide pluggable code points via the API. Just write
> a custom record reader that doubles the work (first round reads actual
> input, second round reads your known output and reiterates). In the mapper,
> separate the first and second logic via a flag.
>
> On Sun, Aug 5, 2012 at 4:17 PM, Yaron Gonen <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> Is there a way to keep a map-task alive after it has finished its work,
>> to later perform another task on its same input?
>> For example, consider the k-means clustering algorithm (k-means
>> description <http://en.wikipedia.org/wiki/K-means_clustering> and hadoop
>> implementation<http://codingwiththomas.blogspot.co.il/2011/05/k-means-clustering-with-mapreduce.html>).
>> The only thing changing between iterations is the clusters centers. All the
>> input points remain the same. Keeping the mapper alive, and performing the
>> next round of map-tasks on the same node will save a lot of communication
>> cost.
>>
>> Thanks,
>> Yaron
>
> --
> Harsh J
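For readers following the k-means example, here is a minimal sketch of the iteration under discussion. It is plain Java on 1-D points and uses no Hadoop classes; the class and method names (KMeansRounds, nearest, round, run) are hypothetical and chosen only for illustration. It shows why the input points stay fixed across rounds while only the centers change, and why the "map" step of a round cannot start until the "reduce" step of the previous round has produced the new centers.

```java
import java.util.Arrays;

// Hypothetical sketch: plain Java, no Hadoop classes involved.
final class KMeansRounds {

    // "Map" step for one point: index of the nearest center.
    static int nearest(double point, double[] centers) {
        int best = 0;
        for (int i = 1; i < centers.length; i++) {
            if (Math.abs(point - centers[i]) < Math.abs(point - centers[best])) {
                best = i;
            }
        }
        return best;
    }

    // One full round: assign every point (map), then average per cluster (reduce).
    static double[] round(double[] points, double[] centers) {
        double[] sum = new double[centers.length];
        int[] count = new int[centers.length];
        for (double p : points) {
            int c = nearest(p, centers);
            sum[c] += p;
            count[c]++;
        }
        double[] next = new double[centers.length];
        for (int i = 0; i < centers.length; i++) {
            // An empty cluster keeps its previous center.
            next[i] = count[i] == 0 ? centers[i] : sum[i] / count[i];
        }
        return next;
    }

    // Repeats rounds on the same points until the centers stop changing
    // or maxRounds is reached. Each round needs the previous round's output.
    static double[] run(double[] points, double[] centers, int maxRounds) {
        double[] current = centers.clone();
        for (int r = 0; r < maxRounds; r++) {
            double[] next = round(points, current);
            if (Arrays.equals(next, current)) {
                break;
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 2.0, 9.0, 11.0};
        double[] centers = run(points, new double[] {0.0, 5.0}, 10);
        System.out.println(Arrays.toString(centers));
    }
}
```

In this sketch the points array is read in every round and never modified, which is the data a long-lived mapper would want to keep local; only the small centers array has to travel between rounds.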
Harsh J 2012-08-05, 22:21
Yaron Gonen 2012-08-06, 07:23