

Lin Ma 2012-12-23, 15:09
Re: reducer tasks start time issue
Thanks for not only answering my question but also giving a detailed
explanation. :-)

regards,
Lin

On Sun, Dec 23, 2012 at 12:15 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> A reduce can't process its complete data set until it has fetched all of
> its partitions, and any map may produce a partition for any reducer.
> Hence, we generally wait until all maps have terminated and their
> partition outputs are ready and copied over to the reduces before we
> begin to group and process the keys.
>
> However, given that you began thinking about this, this paper on
> "Online" Hadoop may interest you:
> http://www.neilconway.org/docs/nsdi2010_hop.pdf
>
> On Sat, Dec 22, 2012 at 6:55 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > Hi guys,
> >
> > Suppose a Hadoop job has both mappers and reducers. My question is: can
> > reducer tasks not begin until all mapper tasks complete? If so, why was
> > it designed this way?
> >
> > thanks in advance,
> > Lin
>
>
>
> --
> Harsh J
>
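
For anyone reading this thread later, the point quoted above can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: the helper names (partition, run_map, run_reduce), the word-count job, and the three sample input splits are all hypothetical, and the partition function is a simplified stand-in for a hash partitioner. It shows that every map can emit records for every reducer, so a reducer that groups its keys before the last map output has arrived would produce incomplete totals.

```python
from collections import defaultdict


def partition(key, num_reducers):
    # Hypothetical, deterministic stand-in for a hash partitioner:
    # the same key always goes to the same reducer.
    return sum(key.encode("utf-8")) % num_reducers


def run_map(split, num_reducers):
    # Word-count style map task: emit (word, 1) and bucket each record
    # by the reducer that owns its key.
    buckets = defaultdict(list)
    for word in split.split():
        buckets[partition(word, num_reducers)].append((word, 1))
    return buckets


def run_reduce(reducer_id, map_outputs):
    # Fetch this reducer's partition from every available map output,
    # group the values by key, then sum them.
    grouped = defaultdict(list)
    for buckets in map_outputs:
        for key, value in buckets.get(reducer_id, []):
            grouped[key].append(value)
    return {key: sum(values) for key, values in sorted(grouped.items())}


NUM_REDUCERS = 2
splits = ["a b a", "b c", "a c c"]  # assumed sample input, one split per map
map_outputs = [run_map(s, NUM_REDUCERS) for s in splits]

# Reducers that wait for all three maps see every record for their keys.
full = {}
for r in range(NUM_REDUCERS):
    full.update(run_reduce(r, map_outputs))

# Reducers that start grouping before the last map finishes miss records.
partial = {}
for r in range(NUM_REDUCERS):
    partial.update(run_reduce(r, map_outputs[:-1]))

print("all maps fetched:  ", full)
print("last map missing:  ", partial)
```

In this toy run the counts for "a" and "c" come out too low when the third map's output is left out, because that map also produced records for the same reducer. The "online" approach in the paper linked above tackles exactly this constraint by a different route.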