MapReduce, mail # dev - Re: Launching Tasks From JobTracker

Hemanth Yamijala 2010-09-10, 13:16
[Moving to mapreduce-dev, copying common-dev]


On Thu, Sep 9, 2010 at 11:30 AM, radheshyam nanduri wrote:
> Hi,
> I am working on writing a scheduler plugin for Hadoop.

Currently, the supported way to plug a scheduler into Hadoop is to
extend the TaskScheduler class in the o.a.h.mapred package. What a
'plug-in' scheduler primarily does is this: given a set of jobs and a
tasktracker, it assigns one or more suitable tasks to that
tasktracker. The scheduler is free to choose which job and which tasks
it wants to schedule. You can take a look at some of the existing
schedulers, such as CapacityTaskScheduler or FairScheduler, to see
what they do and how.
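
To make the shape of that model concrete, here is a minimal,
self-contained sketch. It deliberately does not use the real Hadoop
classes: Job, Tracker, Scheduler and FifoScheduler are hypothetical
stand-ins invented for illustration, and the real TaskScheduler API
differs in its types and signatures. It only shows the idea of "given a
set of jobs and a tasktracker, pick tasks for that tasktracker", using a
simple first-come-first-served policy as an assumed example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only: none of these are the real o.a.h.mapred types.
class SchedulerSketch {

    // A job with some number of tasks still waiting to be scheduled.
    static class Job {
        final String name;
        int pendingTasks;

        Job(String name, int pendingTasks) {
            this.name = name;
            this.pendingTasks = pendingTasks;
        }
    }

    // A tasktracker reporting how many slots it currently has free.
    static class Tracker {
        final String name;
        final int freeSlots;

        Tracker(String name, int freeSlots) {
            this.name = name;
            this.freeSlots = freeSlots;
        }
    }

    // Plays the role of the base class that a plug-in scheduler extends.
    static abstract class Scheduler {
        // Returns the ids of the tasks chosen for the given tracker.
        abstract List<String> assignTasks(Tracker tracker, List<Job> jobs);
    }

    // Assumed example policy: walk the jobs in order and fill the free slots.
    static class FifoScheduler extends Scheduler {
        @Override
        List<String> assignTasks(Tracker tracker, List<Job> jobs) {
            List<String> assigned = new ArrayList<>();
            for (Job job : jobs) {
                while (job.pendingTasks > 0 && assigned.size() < tracker.freeSlots) {
                    job.pendingTasks--;
                    assigned.add(job.name + "-task" + job.pendingTasks);
                }
            }
            return assigned;
        }
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("jobA", 1));
        jobs.add(new Job("jobB", 3));
        Tracker tracker = new Tracker("tt1", 2);
        System.out.println(new FifoScheduler().assignTasks(tracker, jobs));
    }
}
```

A different policy (capacity-based, fair-share, and so on) would only
change the body of assignTasks; the surrounding contract stays the same.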

> I have divided the job received into number of tasks.

This is already done in the framework when a job is submitted. Are you
overriding this? Can you explain what you are doing in some more
detail?

> My task now is to assign a task on to a particular TaskTracker.
> I want to start the Task right away with a method which accepts the Task and
> TaskTracker as arguments.

I am not sure I follow this. A task needs to be sent via Hadoop's RPC
mechanisms to the tasktracker where it should be executed. So,
conceptually, it is the tasktracker that has an RPC method which
accepts tasks to launch. The task is typically launched straight away,
but depending on certain scheduling choices, it may have to wait a
short while for a free slot to execute in.
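
The "launch straight away or wait for a free slot" behaviour can be
sketched roughly as below. This is only a toy, single-threaded model
under my own assumptions: the Tracker class and its receive and finish
methods are hypothetical names, receive stands in for whatever RPC
delivers the task, and none of this is the actual tasktracker code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: hypothetical names, not the real tasktracker implementation.
class SlotSketch {

    static class Tracker {
        private final int totalSlots;
        private final List<String> running = new ArrayList<>();
        private final Deque<String> waiting = new ArrayDeque<>();

        Tracker(int totalSlots) {
            this.totalSlots = totalSlots;
        }

        // Stand-in for the RPC that hands a task to this tracker.
        void receive(String taskId) {
            waiting.addLast(taskId);
            launchWaiting();
        }

        // Called when a running task completes and frees its slot.
        void finish(String taskId) {
            if (running.remove(taskId)) {
                launchWaiting();
            }
        }

        // Launch queued tasks in arrival order while slots are free.
        private void launchWaiting() {
            while (running.size() < totalSlots && !waiting.isEmpty()) {
                running.add(waiting.removeFirst());
            }
        }

        List<String> running() {
            return new ArrayList<>(running);
        }

        List<String> waiting() {
            return new ArrayList<>(waiting);
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker(1);
        tracker.receive("t1"); // a slot is free, so t1 launches at once
        tracker.receive("t2"); // no free slot, so t2 waits
        System.out.println(tracker.running() + " " + tracker.waiting());
        tracker.finish("t1");  // the slot frees up and t2 launches
        System.out.println(tracker.running() + " " + tracker.waiting());
    }
}
```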

> Could you guide me on doing this.

It may be better if you can describe what you want the plug-in
scheduler to achieve.


> Thanks in advance.
> --
> Radheshyam Nanduri