Gregory Lawrence 2010-05-27, 20:37
Rekha Joshi 2010-05-28, 04:14
Gregory Lawrence 2010-05-28, 17:11
Re: Speculative Execution and Streaming
Hemanth Yamijala 2010-05-28, 09:57
Greg,
> Does anybody know whether or not speculative execution works with Hadoop
> streaming?
>
> If so, I have a script that does not appear to ever launch redundant mappers
> for the slow performers. This may be due to the fact that each mapper
> quickly reports (inaccurately) that it is 100% complete. I am using the
> NLineInputFormat and each mapper gets 17 lines of input. Each line requires
> a lot of computation. It appears that all 17 lines immediately get counted
> as being processed early on. Is there any way to report or force accurate
> completion stats? Could this explain why speculative execution never gets
> triggered?
>

I am wondering if you are hitting
https://issues.apache.org/jira/browse/MAPREDUCE-1073. In M/R pipes jobs, the
map task progress moves to 100% as soon as the input is read, because the
processing happens asynchronously. As Sreekanth notes, this would result in
speculation not working as expected.

Thanks
Hemanth
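
On the question of reporting completion from a streaming script: Hadoop streaming reads lines of the form reporter:status:<message> and reporter:counter:<group>,<counter>,<amount> from the script's stderr. The sketch below is a minimal Python mapper that emits both after each input line, so the real per-line progress at least shows up in the task status and counters. Note that this does not, as far as I know, change the progress percentage the framework uses for speculation, so it is a way to observe the problem rather than a fix for it. The function names, the counter group "Mapper", the counter name "LinesProcessed", and the placeholder computation are all hypothetical.

```python
import sys


def expensive_computation(line):
    # Hypothetical stand-in for the real, slow per-line work.
    return line.upper()


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Process input lines, reporting status to Hadoop streaming via stderr."""
    processed = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank input lines
        result = expensive_computation(line)
        out.write(f"{line}\t{result}\n")
        processed += 1
        # Streaming interprets these stderr lines as counter and status updates.
        # The group and counter names here are illustrative assumptions.
        err.write("reporter:counter:Mapper,LinesProcessed,1\n")
        err.write(f"reporter:status:processed {processed} lines\n")
        err.flush()
    return processed
```

To use it as a streaming mapper, end the script with a call to run_mapper(sys.stdin). The exact shape of each input line depends on the input format in use, so the parsing may need adjusting for NLineInputFormat.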