Re: How to Free-up a Map Slot without Killing the Entire Job?
Kun Ling 2013-06-29, 01:42
The "bin/hadoop" script provides an option to kill a single task: run
"bin/hadoop job -kill-task <task-attempt-id>". That looks like what you need.
Here is how killTask works.
## 1. JobClient tells JobTracker which task to kill
1.1. JobClient recognizes this command and calls
JobSubmissionProtocol.killTask(), which asks JobTracker to kill the task.
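Since the command identifies the victim by its task attempt id, a scheduler usually needs to know which job and which task an id belongs to. Below is a minimal sketch that splits an id into its fields, assuming the usual attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number> layout. AttemptIdParts is a hypothetical helper written for this mail, not a Hadoop class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Hadoop): splits a task attempt id into its fields. */
final class AttemptIdParts {
    // Assumed layout: attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number>
    private static final Pattern FORMAT =
            Pattern.compile("attempt_(\\d+)_(\\d+)_([mr])_(\\d+)_(\\d+)");

    final String jobTrackerId;
    final int jobNumber;
    final boolean isMap;
    final int taskNumber;
    final int attemptNumber;

    private AttemptIdParts(String jobTrackerId, int jobNumber, boolean isMap,
                           int taskNumber, int attemptNumber) {
        this.jobTrackerId = jobTrackerId;
        this.jobNumber = jobNumber;
        this.isMap = isMap;
        this.taskNumber = taskNumber;
        this.attemptNumber = attemptNumber;
    }

    /** Parses an id string, rejecting anything that does not match the assumed layout. */
    static AttemptIdParts parse(String id) {
        Matcher m = FORMAT.matcher(id);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a task attempt id: " + id);
        }
        return new AttemptIdParts(
                m.group(1),
                Integer.parseInt(m.group(2)),
                m.group(3).equals("m"),
                Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)));
    }

    /** Rebuilds the id of the job that owns this attempt (job number padded to 4 digits). */
    String jobId() {
        return String.format("job_%s_%04d", jobTrackerId, jobNumber);
    }

    public static void main(String[] args) {
        AttemptIdParts p = parse("attempt_200707121733_0003_m_000005_0");
        System.out.println(p.jobId() + " map=" + p.isMap + " task=" + p.taskNumber);
    }
}
```

With something like this, a scheduler can pick only map attempts of the job it wants to take a slot from before issuing the kill.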
## 2. JobTracker asks TaskTracker to kill the task
2.1. JobTracker first checks whether the cluster is in safe mode and
whether the task is in progress. If the cluster is not in safe mode and the
task is in progress, it checks the permission of the current user and then
calls TaskInProgress.killTask().
2.2. TaskInProgress maintains a TreeMap object, tasksToKill, which
stores the tasks that need to be killed.
2.3. JobTracker uses getTasksToKill() to build a killTasksList, puts
those kill actions into the heartbeat actions, and sends them to the
TaskTracker.
2.4. In TaskTracker, the offerService() loop runs forever. On each pass
it gets the HeartbeatResponse by calling transmitHeartBeat() and processes
the response to extract the actions JobTracker asked it to perform. The
kill-task action is one of them.
2.5. Since the kill-task action is neither a LaunchTaskAction nor a
CommitTaskAction, it is passed to addActionToCleanup(), where the
action's actionId is used to put it into the allCleanupActions queue for
processing.
2.6. The TaskCleanupThread in TaskTracker runs taskCleanup(), which
calls processKillTaskAction(). That method in turn calls kill() on the
TaskInProgress object, which changes the state of the task from RUNNING to
KILLED_UNCLEAN, asks directoryCleanupThread to clean up the directory,
releases the slot, and finally notifies JobTracker via heartbeat.
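To make steps 2.2 to 2.6 easier to follow, here is a toy model of that path. It is only a sketch: ToyJobTracker, ToyTaskTracker and their methods are hypothetical stand-ins I made up, and the real Hadoop classes do much more (permissions, threads, directory cleanup). It does show the key point for a scheduler, though: the kill is only queued when the heartbeat arrives, and the slot is freed later when the cleanup pass runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

/** Toy model of the kill path; every class here is hypothetical, not a Hadoop class. */
final class KillFlowSketch {
    enum State { RUNNING, KILLED_UNCLEAN }

    /** JobTracker side: remembers which attempts each tracker should be told to kill. */
    static final class ToyJobTracker {
        // attempt id -> tracker running it (stands in for the tasksToKill map)
        private final TreeMap<String, String> tasksToKill = new TreeMap<>();

        void killTask(String attemptId, String trackerName) {
            tasksToKill.put(attemptId, trackerName);
        }

        /** Returns the kill actions piggybacked on one tracker's heartbeat response. */
        List<String> heartbeat(String trackerName) {
            List<String> killActions = new ArrayList<>();
            Iterator<Map.Entry<String, String>> it = tasksToKill.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, String> e = it.next();
                if (e.getValue().equals(trackerName)) {
                    killActions.add(e.getKey());
                    it.remove(); // each kill action is delivered once
                }
            }
            return killActions;
        }
    }

    /** TaskTracker side: queues kill actions, then processes them on a cleanup pass. */
    static final class ToyTaskTracker {
        final String name;
        int freeSlots;
        final Map<String, State> attempts = new HashMap<>();
        private final Queue<String> cleanupQueue = new ArrayDeque<>();

        ToyTaskTracker(String name, int slots) {
            this.name = name;
            this.freeSlots = slots;
        }

        void launch(String attemptId) {
            if (freeSlots == 0) throw new IllegalStateException("no free slot");
            freeSlots--;
            attempts.put(attemptId, State.RUNNING);
        }

        /** One heartbeat round trip: kill actions are queued, not executed inline. */
        void heartbeat(ToyJobTracker jt) {
            cleanupQueue.addAll(jt.heartbeat(name));
        }

        /** Stands in for the cleanup thread: kills each queued attempt and frees its slot. */
        void runCleanup() {
            String attemptId;
            while ((attemptId = cleanupQueue.poll()) != null) {
                if (attempts.get(attemptId) == State.RUNNING) {
                    attempts.put(attemptId, State.KILLED_UNCLEAN);
                    freeSlots++;
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyJobTracker jt = new ToyJobTracker();
        ToyTaskTracker tt = new ToyTaskTracker("tracker1", 1);
        tt.launch("attempt_200707121733_0003_m_000005_0");
        jt.killTask("attempt_200707121733_0003_m_000005_0", "tracker1");
        tt.heartbeat(jt);
        tt.runCleanup();
        System.out.println("free slots after kill: " + tt.freeSlots);
    }
}
```

The rest of the job's attempts are untouched in this model, which matches the behaviour you asked for: one slot comes back while the job keeps running.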
## 3. After JobTracker learns that TaskTracker has killed the task, it
asks the TaskTracker to run a clean-up task. It will remove the
3.1. JobTracker gets the KILLED_UNCLEAN status of the task attempt,
changes the type of the task to a task-cleanup task, and puts it in
mapCleanupTasks or reduceCleanupTasks in the JobInProgress object, according
to the original task type. The task list is then passed to the
TaskTracker via heartbeat.
3.2. TaskTracker runs the cleanup task, removes the temporary files
generated by the killed task attempt, changes the status of the cleanup
task to SUCCEEDED, and reports to JobTracker via heartbeat.
3.3. JobTracker gets the heartbeat and knows that the task has been
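The routing in step 3.1 can be sketched as follows. Again this is a toy with assumptions: CleanupRoutingSketch and its methods are hypothetical, and only the two list names and the status names come from the description above. It shows a KILLED_UNCLEAN report being turned into a pending cleanup task on the map or reduce list, and other reports being ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of step 3.1; hypothetical class, only the list and status names follow the text. */
final class CleanupRoutingSketch {
    enum Status { RUNNING, KILLED_UNCLEAN, SUCCEEDED }

    final List<String> mapCleanupTasks = new ArrayList<>();
    final List<String> reduceCleanupTasks = new ArrayList<>();

    /** Called when a heartbeat reports an attempt's status; queues cleanup work if needed. */
    void onStatusReport(String attemptId, boolean isMap, Status status) {
        if (status != Status.KILLED_UNCLEAN) {
            return; // only unclean kills need a follow-up cleanup task
        }
        (isMap ? mapCleanupTasks : reduceCleanupTasks).add(attemptId);
    }

    /** Hands the next pending cleanup task to a tracker, or null if none is pending. */
    String nextCleanupTask(boolean wantMap) {
        List<String> pending = wantMap ? mapCleanupTasks : reduceCleanupTasks;
        return pending.isEmpty() ? null : pending.remove(0);
    }

    public static void main(String[] args) {
        CleanupRoutingSketch job = new CleanupRoutingSketch();
        job.onStatusReport("attempt_200707121733_0003_m_000005_0", true, Status.KILLED_UNCLEAN);
        System.out.println("next map cleanup: " + job.nextCleanupTask(true));
    }
}
```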
On Sat, Jun 29, 2013 at 5:13 AM, Sreejith Ramakrishnan <
[EMAIL PROTECTED]> wrote:
> I'm trying to implement a scheduler (EDF). The scheduler should be able to
> kill or free-up a running map slot so that it can be assigned as a map slot
> to another job.
> I did some looking around and found a kill() method in
> org.apache.hadoop.mapred.JobInProgress. But, this kills the entire job. I
> want the job to still be working after a map slot has been removed from it.
> Can you guys tell me the right method/class to use?