Re: Speculative Execution and Streaming

> Does anybody know whether or not speculative execution works with Hadoop
> streaming?
> If so, I have a script that does not appear to ever launch redundant mappers
> for the slow performers. This may be due to the fact that each mapper
> quickly reports (inaccurately) that it is 100% complete. I am using the
> NLineInputFormat and each mapper gets 17 lines of input. Each line requires
> a lot of computation. It appears that all 17 lines immediately get counted
> as being processed early on. Is there any way to report or force accurate
> completion stats? Could this explain why speculative execution never gets
> triggered?
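
On the reporting part of the question: Hadoop streaming treats stderr lines of the form reporter:status:<message> and reporter:counter:<group>,<counter>,<amount> as status and counter updates. Below is a minimal sketch of a Python mapper that emits them after each input line. The process() function and the MyJob / LinesProcessed names are placeholders I made up. Note that these lines only update the task's status string and counters; as far as I know they do not change the progress percentage, so I would not assume this alone gets speculation to kick in.

```python
#!/usr/bin/env python
import sys


def process(line):
    # Placeholder for the expensive per-line computation (hypothetical).
    return line.strip().upper()


def run(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """Map each input line and report per-line status on stderr."""
    done = 0
    for line in stdin:
        stdout.write("%s\n" % process(line))
        done += 1
        # Streaming picks these stderr lines up as counter and status updates.
        # The group and counter names are made up for this example.
        stderr.write("reporter:counter:MyJob,LinesProcessed,1\n")
        stderr.write("reporter:status:processed %d lines\n" % done)
        stderr.flush()
    return done

# To use this as a streaming mapper script, end the file with a call to run().
```

With something like this, the job's web UI should at least show how many of the 17 lines each mapper has really finished, even while the reported progress sits at 100%.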

I am wondering if you are hitting

In M/R pipes jobs, the map task progress moves to 100% as soon as the
input is read, because the processing happens asynchronously. As
Sreekanth notes, this would result in speculation not working as