What's speculative tasks
Pedro Costa 2010-07-25, 14:03
Hi,
- Hadoop MR uses the term "speculative tasks". What are speculative tasks?
- During the execution of an MR test, when there are no splits to assign to reduce tasks, will those reduce tasks still run? For example, if I configure 6 reduce tasks and there are no splits while the example is running, will the reduce tasks run? If so, where is it checked that a reduce task has a split assigned?
Thanks,
--
Pedro
Re: What's speculative tasks
Hemanth Yamijala 2010-07-26, 04:21
Pedro,
On Sun, Jul 25, 2010 at 7:33 PM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> Hi,
>
> - In hadoop MR it's used the term speculative tasks. What is speculative
> tasks?
When the MR framework detects that some tasks in a job are running slower than others, it can optionally launch duplicates of those tasks on nodes other than the original ones, in the hope that the duplicates complete sooner than the original slow tasks. The motivation for this feature is the observation that jobs tend to have 'stragglers': a small percentage of tasks that are significantly slower than the rest and that lengthen the overall execution time of the job. Typically these stragglers are caused by bad hardware.
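To make the idea concrete, here is a minimal Python sketch of "run a duplicate and keep whichever attempt finishes first". It is a toy model, not Hadoop code: the names run_attempt and run_with_speculation and the delays are made up for illustration, and unlike the real framework it starts both attempts at the same time instead of first detecting that the original is slow.

```python
import concurrent.futures
import time


def run_attempt(attempt_id, delay_seconds):
    """Pretend to run one attempt of a task; the delay stands in for node speed."""
    time.sleep(delay_seconds)
    return attempt_id


def run_with_speculation(original_delay, duplicate_delay):
    """Start the original attempt and a duplicate, and keep the first to finish."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    futures = [
        pool.submit(run_attempt, "original", original_delay),
        pool.submit(run_attempt, "speculative", duplicate_delay),
    ]
    done, _pending = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED
    )
    winner = next(iter(done)).result()
    # The slower attempt is simply abandoned here; a real framework would kill it.
    pool.shutdown(wait=False)
    return winner


if __name__ == "__main__":
    # The original attempt is the straggler in this run.
    print(run_with_speculation(original_delay=0.5, duplicate_delay=0.05))
```

With these delays the duplicate should win, so the job as a whole is not held back by the slow original attempt.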
> - During the execution of a MR test, when we don't have splits to attribute
> to reduce tasks, those reduce tasks will run? For example, if I set that
> will run 6 reduce tasks and I don't have splits during the running of the
> example, the reduce tasks will run? If so, where is verified that a reduce
> task has a split assigned?
Splits are related to map tasks, not reduce tasks. Reduce tasks get their input from the output of the map tasks, which is generated and stored as intermediate data on the compute nodes. With this context, can you clarify what you are looking for?
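The data flow described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Hadoop's implementation: a "split" is just a list of lines, and word_count_map, partition and run_job are hypothetical names, with partition standing in for whatever partitioner the job uses.

```python
from collections import defaultdict


def word_count_map(split):
    """Map task: consumes one input split (here, a list of lines)."""
    for line in split:
        for word in line.split():
            yield word, 1


def partition(key, num_reducers):
    """Choose which reduce task receives a key (toy stand-in for a partitioner)."""
    return sum(key.encode("utf-8")) % num_reducers


def run_job(splits, num_reducers):
    # One map task per split; each writes intermediate output bucketed by reducer.
    intermediate = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        for key, value in word_count_map(split):
            intermediate[partition(key, num_reducers)][key].append(value)
    # Each reduce task reads only its own bucket of map output, never a split.
    return [
        {key: sum(values) for key, values in bucket.items()}
        for bucket in intermediate
    ]


if __name__ == "__main__":
    print(run_job([["a b a"], ["b c"]], num_reducers=2))
    print(run_job([], num_reducers=6))
```

In this model, calling run_job with no splits runs zero map tasks, and each of the six reducers simply sees an empty bucket. That is a property of the sketch only; it says nothing about why a particular Hadoop job might hang.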
Thanks
Hemanth
Re: What's speculative tasks
Pedro Costa 2010-07-26, 08:33
For the second question:
I'm running the wordcount example,

    bin/hadoop jar hadoop-0.20.2-examples.jar wordcount gutenberg gutenberg-output

but the gutenberg directory contains no files. In my case, the program hangs and nothing happens. Is this expected?
-- Pedro