Re: Questions with regard to scheduling of map and reduce tasks
> 0.23.1 with Pig 0.10.0 on top.
> How is the preemption supposed to work? Is a single reducer supposed to
> be preempted or will a batch of reducers be preempted?
A batch of reducers. Enough reducers will be killed to accommodate any/all pending map-tasks.
> Also, when you
> say preemption, do you mean that the current execution of a reducer is
> actually paused and resumed again later. Or, does preemption mean that
> the reducer's container is discarded and must be started again from
No, by preempted I mean that the current reduce tasks are killed. Because MapReduce tolerates an arbitrary number of killed task-attempts (as opposed to failed task-attempts), this is okay. So yes, when the reducers get rescheduled they will start all over again.
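The preemption behaviour described above can be modelled roughly as follows. This is only a toy sketch in Python, not the logic in MRAppMaster: the function `reducers_to_preempt`, the class `AttemptLog`, the memory-based sizing and the `max_failed_attempts` parameter are all hypothetical names and assumptions made for illustration.

```python
import math


def reducers_to_preempt(pending_maps, map_mem_mb, reduce_mem_mb,
                        free_mem_mb, running_reducers):
    """Return how many running reducers to kill so that all pending maps fit.

    Assumption: every map needs map_mem_mb, and killing one reducer
    frees exactly reduce_mem_mb.
    """
    needed_mb = pending_maps * map_mem_mb - free_mem_mb
    if needed_mb <= 0:
        return 0  # the pending maps already fit; nothing to preempt
    to_kill = math.ceil(needed_mb / reduce_mem_mb)
    # We cannot kill more reducers than are actually running.
    return min(to_kill, running_reducers)


class AttemptLog:
    """Tracks attempts of one task; only FAILED attempts count against it."""

    def __init__(self, max_failed_attempts):
        self.max_failed_attempts = max_failed_attempts  # hypothetical limit
        self.failed = 0
        self.killed = 0

    def record(self, outcome):
        if outcome == "FAILED":
            self.failed += 1
        elif outcome == "KILLED":
            self.killed += 1  # preempted attempts land here
        else:
            raise ValueError("unknown outcome: " + outcome)

    def task_has_failed(self):
        # Killed attempts are deliberately ignored in this check.
        return self.failed >= self.max_failed_attempts
```

In this model a reducer can be preempted any number of times without the task being declared failed; the cost is that each new attempt redoes its work from the beginning.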
> Do you know of any doc on the specifics of task scheduling? Would you
> say that the example I gave is in line with how scheduling is
We don't have docs on task-level scheduling, but you can look at RMContainerAllocator.java and related classes in MRAppMaster (i.e. the hadoop-mapreduce-client-app/ module) to understand this.
And no, as I mentioned before, scheduling isn't random: maps are scheduled first, followed by a slow reduce ramp-up as reducers finish.
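To make the "maps first, slow reduce ramp-up" ordering concrete, here is a minimal Python sketch of one allocation round. It is an illustration under stated assumptions, not the code in RMContainerAllocator.java: `assign_containers` and the `reduce_cap` parameter (a limit on concurrently running reducers that the caller raises over time) are hypothetical.

```python
def assign_containers(free, pending_maps, pending_reduces,
                      running_reduces, reduce_cap):
    """Split `free` containers between maps and reduces for one round.

    Maps are served first. Reducers only get what is left over, and
    never more than the current ramp-up cap allows.
    """
    maps = min(free, pending_maps)
    free -= maps
    # Headroom under the cap; zero if we are already at or above it.
    headroom = max(0, reduce_cap - running_reduces)
    reduces = min(free, pending_reduces, headroom)
    return maps, reduces
```

For example, with 5 free containers, 3 pending maps and a cap of 1, `assign_containers(5, 3, 10, 0, 1)` gives all 3 maps a container and starts only 1 reducer, even though one more container is free.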
> FYI: the starvation issue is a known bug (https://issues.apache.org/jira/browse/MAPREDUCE-4299).
I mistakenly assumed that you were using the capacity-scheduler. There were other such bugs in both the Fifo and capacity-schedulers which have since been fixed (I'm not sure of the fixed-version). We've tested the capacity-scheduler a lot more in the latest version (0.23.2/branch-0.23), if you can pick that up.
Vinod Kumar Vavilapalli