-
Hadoop's default FIFO scheduler
He Chen 2010-10-14, 17:06
Hi all
I am testing the performance of my Hadoop clusters with Hadoop's default FIFO scheduler, and I have found an interesting phenomenon.
When I submit a series of jobs, some jobs are executed earlier even though they were submitted later. All jobs request the same number of blocks. For example:
job 1 submitted at time 0
job 2 submitted at time 1
job 3 submitted at time 2
job 4 submitted at time 3
Job 4's queue time is smaller than job 3's queue time. This violates the FIFO principle. Can anyone give a hint?
Thanks
Chen
-
Re: Hadoop's default FIFO scheduler
abhishek sharma 2010-10-14, 17:10
What is the inter-arrival time between these jobs?
There is a "set up" phase for jobs before they are launched. It is possible that the order of jobs can change due to slightly different set up times. Apart from the number of blocks, it may matter "where" these blocks lie.
Abhishek
-
Re: Hadoop's default FIFO scheduler
He Chen 2010-10-14, 17:29
They arrived within 1 minute. I understand there is a setup phase, which will use any free slot, whether map or reduce.
My queue time is the period between the time the job is submitted and the start of its map stage. Because the setup phase has higher priority than map and reduce tasks, any job submitted to the queue will be set up no matter how many earlier map and reduce tasks are still waiting to be assigned.
Now, I am sure that job 3's setup stage finished earlier than job 4's. However, job 3's map stage started later than job 4's. BTW, they request the same number of blocks.
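A minimal sketch (not from the thread) of the queue-time measurement described above, assuming the submit time and map-stage start time of each job have already been extracted from the job history. The `JobTiming` class, the `fifo_inversions` helper, and all timing values are hypothetical and only illustrate the reported pattern.

```python
from dataclasses import dataclass


@dataclass
class JobTiming:
    name: str
    submit_time: float     # seconds since the first submission (made-up values)
    map_start_time: float  # when the job's map stage started (made-up values)

    @property
    def queue_time(self) -> float:
        # Queue time as defined in the message above:
        # start of the map stage minus submission time.
        return self.map_start_time - self.submit_time


def fifo_inversions(jobs):
    """Return (earlier, later) name pairs where the later-submitted job
    started its map stage before the earlier-submitted one."""
    ordered = sorted(jobs, key=lambda j: j.submit_time)
    pairs = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later.map_start_time < earlier.map_start_time:
                pairs.append((earlier.name, later.name))
    return pairs


# Made-up timings that reproduce the pattern reported in the thread.
jobs = [
    JobTiming("job1", 0, 10),
    JobTiming("job2", 1, 40),
    JobTiming("job3", 2, 95),
    JobTiming("job4", 3, 70),
]

for job in jobs:
    print(job.name, job.queue_time)
print(fifo_inversions(jobs))
```

With these numbers, job4's queue time is smaller than job3's, and the only pair flagged is job3/job4, which is the kind of inversion being asked about.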
-
Re: Hadoop's default FIFO scheduler
Nan Zhu 2010-10-14, 17:50
Hi, Chen
I think it's due to disk/network performance, I mean the speed of reading the content from disk or over the network into local memory.
If job 3 doesn't yet have the complete data to start its mappers but job 4 does, the scheduler would select job 4's tasks from the list to run first.
I think the so-called FIFO principle applies to the setup stage: the job that arrives first is set up first.
Nan
-
Re: Hadoop's default FIFO scheduler
He Chen 2010-10-14, 18:15
I doubt this. According to the execution record, the disk and network utilization are the same.
Also, if the FCFS scheduler only applies to the setup stage, which takes just a few seconds, then it does not mean much: users who submit their jobs earlier only get their jobs set up earlier, but may then be delayed for any possible hardware or software reason? Actually, most of the jobs finish in FIFO order. What I am curious about is why some jobs get ahead in the FIFO queue at the start. What is the reason?
-
Re: Hadoop's default FIFO scheduler
Hemanth Yamijala 2010-10-15, 04:45
Hi,
On Thu, Oct 14, 2010 at 10:59 PM, He Chen <[EMAIL PROTECTED]> wrote:
> they arrived in 1 minute. I understand there will be a setup phase which
> will use any free slot no matter map or reduce.
You mean all jobs were submitted within a minute? That means a few seconds between jobs? Or do you mean each job was submitted a minute after the earlier job? Also, which version of Hadoop is this?
-
Re: Hadoop's default FIFO scheduler
He Chen 2010-10-15, 10:11
Hi Hemanth
All jobs were submitted within a minute, with a few seconds between jobs. The Hadoop version is 0.20.2.
Thanks
-
Re: Hadoop's default FIFO scheduler
Hemanth Yamijala 2010-10-23, 03:12
Hi,
Sorry for a delayed response.
Once jobs are submitted, they are set up by running a setup task. These are run in order of submission. However, the setup task itself runs on any free map or reduce slot on any node. I can imagine scenarios where the setup task of a job that was submitted later completes first. When that happens, that job can start running out of order.
Thanks Hemanth
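The scenario described above can be sketched as a toy model. This is not Hadoop's scheduler code; it only assumes that the scheduler scans jobs in submission order and skips any job whose setup task has not finished yet. The `Job` tuple, the `pick_job` function, and the timings are hypothetical.

```python
from typing import NamedTuple, Optional


class Job(NamedTuple):
    name: str
    submit_time: float      # when the job entered the queue
    setup_done_time: float  # when its setup task finished (made-up values)


def pick_job(queue, now) -> Optional[Job]:
    """Scan the queue in submission order and return the first job whose
    setup has finished by `now`; jobs still in setup are skipped."""
    for job in sorted(queue, key=lambda j: j.submit_time):
        if job.setup_done_time <= now:
            return job
    return None


queue = [
    Job("job3", submit_time=2, setup_done_time=9),  # setup ran on a slower slot
    Job("job4", submit_time=3, setup_done_time=6),  # setup finished sooner
]

print(pick_job(queue, now=7).name)   # job4: job3 is still in setup
print(pick_job(queue, now=10).name)  # job3: both are ready, FIFO order wins
```

In this model the queue itself stays in FIFO order; the reordering comes only from which jobs are eligible at the moment a slot frees up.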
-
Re: Hadoop's default FIFO scheduler
He Chen 2010-10-23, 13:04
I checked the job history. The setup tasks finished in order.