On Mar 3, 2011, at 2:51 AM, Steve Loughran wrote:
> yes, but the problem is determining which one will fail. Ideally you should find the root cause, which is often some race condition or hardware fault. If it's the same server every time, turn it off.
> You can play with the specex parameters, maybe change when they get kicked off. The assumption in the code is that the slowness is caused by H/W problems (especially HDD issues), and it tries to avoid duplicate work. If every map were duplicated, you'd be doubling the effective cost of each query and annoying everyone else in the cluster. Plus the increased disk and network IO might slow things down.
> Look at the options, have a play and see. If it doesn't have the feature, you can always try coding it in; if the scheduler API lets you do it, you won't be breaking anyone else's code.
Thanks. I'll take it under consideration. In my case, it would be really beneficial to duplicate the work. The task in question is a single task on a single node (numerous mappers feed data into a single reducer), so duplicating the reducer represents very little wasted effort while mitigating a major bottleneck in the job's performance, since the job simply is not done until that single reducer finishes. I would really like to be able to do what I am suggesting: duplicate the reducer and kill the clones after the winner finishes.
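For reference, a sketch of the relevant knob in a Hadoop 0.20/1.x-era configuration (property names are version-dependent; later releases renamed this to mapreduce.reduce.speculative). Note that stock speculative execution only launches a backup attempt when a task looks slow relative to its peers, so on its own it will not unconditionally clone the single reducer as described above:

```xml
<!-- Sketch, assuming a Hadoop 0.20/1.x mapred-site.xml or per-job
     configuration. Enables speculative (duplicate) execution for
     reduce tasks; the framework kills the losing attempt once one
     attempt commits. -->
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
```

The same setting can be toggled per job from the old Java API via JobConf.setReduceSpeculativeExecution(true), which avoids changing the cluster-wide default.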
Keith Wiley
[EMAIL PROTECTED]
keithwiley.com
music.keithwiley.com
"Luminous beings are we, not this crude matter."