HDFS, mail # user - Re: reducer tasks start time issue


Rishi Yadav 2012-12-22, 16:09
Lin Ma 2012-12-23, 15:09
Harsh J 2012-12-22, 16:15
Re: reducer tasks start time issue
Lin Ma 2012-12-23, 15:09
Thanks for not only answering my question but also giving a detailed
explanation. :-)

regards,
Lin

On Sun, Dec 23, 2012 at 12:15 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> A reduce can't process its complete data set until it has fetched all of
> its partitions, and any map may produce a partition for any reducer.
> Hence, we generally wait until all maps have finished and their
> partition outputs are ready and copied over to the reduces before we
> begin to group and process the keys.
>
> However, given that you began thinking about this, this paper on
> "Online" Hadoop may interest you:
> http://www.neilconway.org/docs/nsdi2010_hop.pdf
>
> On Sat, Dec 22, 2012 at 6:55 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > Hi guys,
> >
> > Suppose a Hadoop job has both mappers and reducers. My question
> > is: can reducer tasks not begin until all mapper tasks complete? If so,
> > why was it designed this way?
> >
> > thanks in advance,
> > Lin
>
>
>
> --
> Harsh J
>
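
To make Harsh's point concrete, here is a minimal sketch in plain Python (not Hadoop's API) of a word count with three map tasks and two reducers. Every name in it (`partition`, `run_map`, `run_reduce`, the sample splits) is hypothetical, and the byte-sum partitioner is only a deterministic stand-in for a hash partitioner. It shows that each map can emit records for any reducer, so a reducer that groups its keys before the last map finishes produces wrong totals.

```python
from collections import defaultdict


def partition(key, num_reducers):
    # Hypothetical stand-in for a hash partitioner: deterministic across runs.
    return sum(key.encode("utf-8")) % num_reducers


def run_map(split, num_reducers):
    """Word-count map task: emit (word, 1), bucketed by target reducer."""
    buckets = defaultdict(list)
    for word in split.split():
        buckets[partition(word, num_reducers)].append((word, 1))
    return buckets


def run_reduce(reducer_id, map_outputs):
    """Fetch this reducer's partition from each given map output, then group and sum."""
    grouped = defaultdict(int)
    for buckets in map_outputs:
        for word, count in buckets.get(reducer_id, []):
            grouped[word] += count
    return dict(grouped)


NUM_REDUCERS = 2
splits = ["a b a", "b c", "a c c"]  # one input split per map task
map_outputs = [run_map(s, NUM_REDUCERS) for s in splits]

# Reducers that wait for all three maps see every partition.
full = {}
for r in range(NUM_REDUCERS):
    full.update(run_reduce(r, map_outputs))

# Reducers that start after only two maps miss the third map's records.
early = {}
for r in range(NUM_REDUCERS):
    early.update(run_reduce(r, map_outputs[:2]))

print("all maps done:", full)
print("started early:", early)
```

In this sketch the third split contributes records for "a" and "c", so the early run undercounts both keys; only "b", which the third map never emits, comes out the same.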