Hadoop, mail # user - Question about Hadoop Default FCFS Job Scheduler


Re: Question about Hadoop Default FCFS Job Scheduler
Nan Zhu 2011-01-17, 16:46
OK, I get your point.

You are asking why we don't move the for loop into obtainNewLocalMapTask().

Yes, I think we could do that, but the result would be the same as with the
current code, and I don't think it would bring much performance benefit.
Personally, I like the current style. :-)
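
To illustrate why the two styles end up with the same assignment, here is a
minimal sketch. It is not Hadoop code: Job, obtainOneLocalTask(),
assignPerSlot(), assignBatched() and sampleQueue() are hypothetical names made
up for this example, and it only models local map tasks for a single node,
ignoring the non-local fallback and everything else the real assignTasks()
does.

```java
import java.util.ArrayList;
import java.util.List;

public class FcfsSketch {
    /** Hypothetical stand-in for JobInProgress: tracks only the local map tasks pending for one node. */
    static final class Job {
        final String name;
        int pendingLocal;

        Job(String name, int pendingLocal) {
            this.name = name;
            this.pendingLocal = pendingLocal;
        }

        /** Hands out one task (labelled by job name), or null if no local task is left. */
        String obtainOneLocalTask() {
            if (pendingLocal == 0) {
                return null;
            }
            pendingLocal--;
            return name;
        }
    }

    /** Current style: outer loop over free slots, one task per call. */
    static List<String> assignPerSlot(List<Job> queue, int freeSlots) {
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < freeSlots; ++i) {
            for (Job job : queue) {
                String t = job.obtainOneLocalTask();
                if (t != null) {
                    assigned.add(t);
                    break; // leave the job loop; the next slot rescans from the first job
                }
            }
        }
        return assigned;
    }

    /** Proposed style: each job hands out as many local tasks as the free slots allow. */
    static List<String> assignBatched(List<Job> queue, int freeSlots) {
        List<String> assigned = new ArrayList<>();
        for (Job job : queue) {
            while (assigned.size() < freeSlots) {
                String t = job.obtainOneLocalTask();
                if (t == null) {
                    break; // this job has no more local tasks; move on to the next job
                }
                assigned.add(t);
            }
        }
        return assigned;
    }

    /** Two jobs in FCFS order: job1 has 2 local tasks for this node, job2 has 3. */
    static List<Job> sampleQueue() {
        List<Job> queue = new ArrayList<>();
        queue.add(new Job("job1", 2));
        queue.add(new Job("job2", 3));
        return queue;
    }

    public static void main(String[] args) {
        System.out.println(assignPerSlot(sampleQueue(), 4));  // [job1, job1, job2, job2]
        System.out.println(assignBatched(sampleQueue(), 4));  // [job1, job1, job2, job2]
    }
}
```

In this toy model both versions drain the first job's local tasks before
touching the second job, so the only difference is where the loop lives.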

Best,

Nan

On Tue, Jan 18, 2011 at 12:24 AM, He Chen <[EMAIL PROTECTED]> wrote:

> Hi Nan,
>
> Thank you for the reply. I understand what you mean. What concerns me is
> that the "obtainNewLocalMapTask(...)" method assigns only one task at a
> time.
>
> Now I understand why it only assigns one task at a time. It is because of
> the outer loop:
>
> for (i = 0; i < MapperCapacity; ++i){
>
> (......)
>
> }
>
> My question is why this loop exists here. Why does the scheduler use this
> type of loop? It adds overhead to the task assignment process if only one
> task is assigned at a time. Obviously, a node could be assigned all the
> available local tasks it can afford in one "obtainNewLocalMapTask(......)"
> method call.
>
> Bests
>
> Chen
>
> On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu <[EMAIL PROTECTED]> wrote:
>
> > Hi, Chen
> >
> > How is it going recently?
> >
> > Actually, I think you misunderstand the code in assignTasks() in
> > JobQueueTaskScheduler.java. See the following structure of the relevant
> > code:
> >
> > // Sorry, I have hacked the code a lot, so the variable names may differ
> > // from the original version
> >
> > for (i = 0; i < MapperCapacity; ++i){
> >   ...
> >   for (JobInProgress job:jobQueue){
> >       // try to schedule a node-local or rack-local map task;
> >       // this is the interesting place
> >       t = job.obtainNewLocalMapTask(...);
> >       if (t != null){
> >          ...
> >          // This break leaves "for (job:jobQueue)" and returns control to
> >          // the outer loop, whose next iteration restarts map task
> >          // selection from the first job. So it actually schedules all of
> >          // the first job's local mappers first, until the map slots are
> >          // full.
> >          break;
> >       }
> >   }
> > }
> >
> > BTW, we can schedule only one reduce task per heartbeat.
> >
> >
> >
> > Best,
> > Nan
> > On Sat, Jan 15, 2011 at 1:45 PM, He Chen <[EMAIL PROTECTED]> wrote:
> >
> > > Hey all
> > >
> > > Why does the FCFS scheduler only let a node choose one task at a time
> > > from one job? To increase data locality, it would be reasonable to let
> > > a node choose all of its local tasks (if it can) from a job at one time.
> > >
> > > Any reply will be appreciated.
> > >
> > > Thanks
> > >
> > > Chen
> > >
> >
>