HDFS, mail # user - Selecting a task for the tasktracker


Re: Selecting a task for the tasktracker
Yaron Gonen 2012-12-27, 21:18
Thanks a lot!
On Thu, Dec 27, 2012 at 8:11 PM, Vinod Kumar Vavilapalli <
[EMAIL PROTECTED]> wrote:

>
> On top of that, the message indicates that you need to have your scheduler
> class in the mapred package.
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Dec 27, 2012, at 7:38 AM, Hemanth Yamijala wrote:
>
> Hi,
>
> Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
> trunk, the MapReduce framework has been completely reworked on top of YARN (
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html),
> so you may need to look at different interfaces for building your own
> scheduler.
>
> In 1.0, the primary function of the TaskScheduler is the assignTasks
> method. Given a TaskTracker object as input, this method figures out how
> many free map and reduce slots exist in that particular tasktracker and
> selects one or more tasks that can be scheduled on it. Since task selection
> is the primary responsibility and the granularity is at a task level, the
> class is called TaskScheduler.
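
To make the slot accounting described above concrete, here is a minimal sketch. It is a simplified stand-in model, not the Hadoop API: AssignTasksSketch, TrackerStatus and this assignTasks signature are all hypothetical names invented for illustration, and tasks are plain strings. It only assumes that a tracker reports its slot capacity and its running task counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in model: none of these types are real Hadoop classes.
final class AssignTasksSketch {

    /** Hypothetical snapshot of the slot usage one tasktracker reports. */
    static final class TrackerStatus {
        final int maxMapSlots;
        final int runningMaps;
        final int maxReduceSlots;
        final int runningReduces;

        TrackerStatus(int maxMapSlots, int runningMaps,
                      int maxReduceSlots, int runningReduces) {
            this.maxMapSlots = maxMapSlots;
            this.runningMaps = runningMaps;
            this.maxReduceSlots = maxReduceSlots;
            this.runningReduces = runningReduces;
        }
    }

    /**
     * Works out the free map and reduce slots on the tracker, then hands out
     * pending tasks (in list order) until those slots are used up.
     */
    static List<String> assignTasks(TrackerStatus tracker,
                                    List<String> pendingMaps,
                                    List<String> pendingReduces) {
        int freeMapSlots = Math.max(0, tracker.maxMapSlots - tracker.runningMaps);
        int freeReduceSlots = Math.max(0, tracker.maxReduceSlots - tracker.runningReduces);

        List<String> assigned = new ArrayList<>();
        assigned.addAll(pendingMaps.subList(0, Math.min(freeMapSlots, pendingMaps.size())));
        assigned.addAll(pendingReduces.subList(0, Math.min(freeReduceSlots, pendingReduces.size())));
        return assigned;
    }

    public static void main(String[] args) {
        // Two map slots with one busy, two reduce slots with none busy.
        TrackerStatus tracker = new TrackerStatus(2, 1, 2, 0);
        List<String> assigned = assignTasks(tracker,
                Arrays.asList("m1", "m2", "m3"),
                Arrays.asList("r1"));
        System.out.println(assigned); // prints [m1, r1]
    }
}
```

The real schedulers apply further policy on top of this (for example, how many tasks to hand out per heartbeat), which this sketch deliberately leaves out.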
>
> The method of choosing a job and then a task within the job is customised
> by the different schedulers already present in Hadoop. Also, the core logic
> of selecting a map task with data locality optimizations is not implemented
> in the schedulers themselves; instead, they rely on the JobInProgress object
> in the MapReduce framework for that.
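
The split of responsibilities described above (the scheduler picks the job, the job object picks the task with locality in mind) can be sketched roughly as follows. This is an illustrative model only: LocalitySketch, Job, addMap, pickMapTaskFor and pickMapTask are hypothetical names, the job order is plain FIFO, and "locality" is reduced to an exact host match. The real JobInProgress logic is more involved.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in model: none of these types are real Hadoop classes.
final class LocalitySketch {

    /** Hypothetical job that knows which host holds each pending map's split. */
    static final class Job {
        final String name;
        // Pending map task id -> host that stores its input split.
        private final Map<String, String> pendingMaps = new LinkedHashMap<>();

        Job(String name) {
            this.name = name;
        }

        Job addMap(String taskId, String splitHost) {
            pendingMaps.put(taskId, splitHost);
            return this;
        }

        /**
         * Prefers a pending map whose split lives on the tracker's host and
         * falls back to the oldest pending map. Returns null if none is left.
         */
        String pickMapTaskFor(String trackerHost) {
            String chosen = null;
            for (Map.Entry<String, String> entry : pendingMaps.entrySet()) {
                if (entry.getValue().equals(trackerHost)) {
                    chosen = entry.getKey(); // data-local task wins
                    break;
                }
                if (chosen == null) {
                    chosen = entry.getKey(); // remember a non-local fallback
                }
            }
            if (chosen != null) {
                pendingMaps.remove(chosen);
            }
            return chosen;
        }
    }

    /** Scheduler side: walk the jobs in order and delegate task choice to the job. */
    static String pickMapTask(List<Job> jobsInOrder, String trackerHost) {
        for (Job job : jobsInOrder) {
            String task = job.pickMapTaskFor(trackerHost);
            if (task != null) {
                return job.name + "/" + task;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Job job1 = new Job("job1").addMap("m0", "hostA").addMap("m1", "hostB");
        Job job2 = new Job("job2").addMap("m0", "hostB");
        List<Job> jobs = Arrays.asList(job1, job2);

        System.out.println(pickMapTask(jobs, "hostB")); // prints job1/m1
        System.out.println(pickMapTask(jobs, "hostB")); // prints job1/m0
        System.out.println(pickMapTask(jobs, "hostB")); // prints job2/m0
    }
}
```

A custom scheduler that wants different behaviour would change the job ordering in the scheduler-side loop, while the locality preference stays inside the job object.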
>
> To implement your own Scheduler, it may be best to look at the sources of
> existing schedulers: JobQueueTaskScheduler, CapacityTaskScheduler or
> FairScheduler. In particular, the last two are in the contrib modules of
> mapreduce, and hence are fairly self-contained and easy to follow. Their build
> files will also tell you how to resolve any compile problems like the one
> you are facing.
>
> Thanks
> Hemanth
>
>
>
>
> On Thu, Dec 27, 2012 at 4:10 PM, Yaron Gonen <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> If I understand correctly, the job scheduler (why is the class called
>> TaskScheduler?) is responsible for assigning the task whose split is as
>> close as possible to the tasktracker.
>> Meaning that the job scheduler is responsible for two things:
>>
>>    1. Selecting a job.
>>    2. Once a job is selected, assigning the closest task to the tasktracker
>>    that sent the heartbeat.
>>
>> Is this correct?
>>
>> I want to write my own job scheduler to change the logic above, but I get
>> the error "The type TaskScheduler is not visible".
>> How can I write my own scheduler?
>>
>> thanks
>>
>
>
>