MapReduce, mail # user - Speculative Execution and Streaming


Re: Speculative Execution and Streaming
Hemanth Yamijala 2010-05-28, 09:57
Greg,

> Does anybody know whether or not speculative execution works with Hadoop
> streaming?
>
> If so, I have a script that does not appear to ever launch redundant mappers
> for the slow performers. This may be due to the fact that each mapper
> quickly reports (inaccurately) that it is 100% complete. I am using the
> NLineInputFormat and each mapper gets 17 lines of input. Each line requires
> a lot of computation. It appears that all 17 lines immediately get counted
> as being processed early on. Is there any way to report or force accurate
> completion stats? Could this explain why speculative execution never gets
> triggered?
>

I am wondering if you are hitting
https://issues.apache.org/jira/browse/MAPREDUCE-1073.

In M/R pipes jobs, the map task progress moves to 100% as soon as the
input is read, because the processing happens asynchronously. As
Sreekanth notes, this would result in speculation not working as
expected.
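
For what it's worth, a streaming mapper can at least surface its real per-line progress through the reporter lines that Hadoop streaming reads from stderr (`reporter:status:<message>` and `reporter:counter:<group>,<counter>,<amount>`). The sketch below is only an illustration: `expensive`, the `Mapper` counter group, and the `LinesProcessed` counter name are made up, and this is not expected to change the map progress percentage or to make speculation kick in. It only makes the actual state visible in the job UI.

```python
import sys


def status_line(message):
    # Line format that Hadoop streaming interprets as a task status update.
    return "reporter:status:%s\n" % message


def counter_line(group, counter, amount):
    # Line format that Hadoop streaming interprets as a counter increment.
    return "reporter:counter:%s,%s,%d\n" % (group, counter, amount)


def expensive(line):
    # Hypothetical stand-in for the heavy per-line computation.
    return line.strip().upper()


def run_mapper(stdin, stdout, stderr):
    """Process each input line, reporting on stderr after every line."""
    done = 0
    for line in stdin:
        stdout.write(expensive(line) + "\n")
        done += 1
        # Counter group and name are assumptions chosen for this example.
        stderr.write(counter_line("Mapper", "LinesProcessed", 1))
        stderr.write(status_line("processed %d lines" % done))
        stderr.flush()
    return done


if __name__ == "__main__":
    run_mapper(sys.stdin, sys.stdout, sys.stderr)
```

With 17 lines per mapper, the `LinesProcessed` counter and the status string would then show how far each slow task really is, even while the reported progress sits at 100%.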

Thanks
Hemanth