Pedro Costa 2010-07-25, 14:03
Re: What's speculative tasks

On Sun, Jul 25, 2010 at 7:33 PM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> Hi,
> - Hadoop MR uses the term "speculative tasks". What are speculative
> tasks?

When the MR framework detects that some tasks in a job are running
slower than the others, it can optionally launch duplicates of those
tasks on nodes other than the original ones, in the hope that the
duplicates will complete sooner than the original slow tasks. The
motivation for this feature is the observation that every job has
'stragglers': a small percentage of tasks that are significantly
slower than the rest and that lengthen the overall execution time of
the job. Typically these stragglers come about due to bad
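
As a rough illustration of the straggler detection described above,
here is a minimal sketch. It is not Hadoop's actual scheduling code:
the class name, the method name, and the 0.2 progress gap are all
hypothetical, chosen only to show the idea of flagging tasks that
trail the average progress of the job.

```java
import java.util.ArrayList;
import java.util.List;

public class SpeculationSketch {

    // Returns the indices of tasks whose progress (0.0 to 1.0) trails the
    // mean progress of all tasks by at least `gap`. Hypothetical helper,
    // not part of any Hadoop API.
    static List<Integer> pickStragglers(double[] progress, double gap) {
        List<Integer> stragglers = new ArrayList<>();
        if (progress.length == 0) {
            return stragglers;
        }
        double sum = 0.0;
        for (double p : progress) {
            sum += p;
        }
        double mean = sum / progress.length;
        for (int i = 0; i < progress.length; i++) {
            if (mean - progress[i] >= gap) {
                stragglers.add(i); // candidate for a duplicate attempt
            }
        }
        return stragglers;
    }

    public static void main(String[] args) {
        // Assumed sample data: task 2 is far behind the others.
        double[] progress = {0.90, 0.95, 0.30, 0.85};
        System.out.println("Candidates for a duplicate attempt: "
                + pickStragglers(progress, 0.2));
    }
}
```

In Hadoop releases of this period the feature itself is, as far as I
know, switched on or off per job with the properties
mapred.map.tasks.speculative.execution and
mapred.reduce.tasks.speculative.execution.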

> - During the execution of an MR test, when there are no splits to assign
> to reduce tasks, will those reduce tasks run? For example, if I configure
> 6 reduce tasks and there are no splits while the example is running,
> will the reduce tasks run? If so, where is it verified that a reduce
> task has a split assigned?

Splits are related to map tasks, not reduce tasks. Reduce tasks get
their input from the output of the map tasks, which is generated and
stored as intermediate data on the compute nodes. Can you clarify what
you are looking for, given this context?
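
To make the map-output-to-reducer relationship concrete, here is a
small sketch of how intermediate keys could be assigned to reduce
tasks. The hash-modulo formula mirrors, to my knowledge, what the
default hash partitioner does. The class and method names are
hypothetical, and plain String keys are a simplification of Hadoop's
own key types.

```java
import java.util.Arrays;

public class PartitionSketch {

    // Maps an intermediate key to one of numReduceTasks partitions.
    // Masking with Integer.MAX_VALUE keeps the hash non-negative.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of the given keys land on each reduce task.
    static int[] countPerReducer(String[] keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample map output keys, with 6 reduce tasks configured.
        String[] keys = {"a", "b", "a", "b"};
        System.out.println(Arrays.toString(countPerReducer(keys, 6)));
    }
}
```

With only a few distinct keys and 6 reduce tasks, some partitions
receive nothing, so what a reduce task gets depends on the map output
and not on any split.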

Pedro Costa 2010-07-26, 08:33