Let's assume there are two jobs, J1 (100 map tasks) and J2 (200 map
tasks), and the cluster has a capacity of 150 map tasks (15 nodes with 10 map
slots per node). Hadoop is using the default FIFO scheduler. If I submit
J1 first and then J2, will the jobs run in parallel, or does J1 have to
complete before J2 starts?
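To make the two possibilities concrete, here is a small sketch of the arithmetic for the first wave of map tasks. It is not Hadoop code: `first_wave` and the `spill_over` flag are hypothetical names I made up to contrast a strictly serial model with one where slots the first job cannot fill go to the next job in the queue.

```python
def first_wave(capacity, jobs, spill_over):
    """Return how many map slots each job gets in the first wave.

    jobs: list of (name, pending_map_tasks) in submission order.
    spill_over: if True, slots the head job cannot use go to the next job;
                if False, only the head job runs (strictly serial model).
    """
    free = capacity
    allocation = {}
    for name, pending in jobs:
        granted = min(free, pending)
        allocation[name] = granted
        free -= granted
        if not spill_over:
            # Strictly serial model: every later job waits for the head job.
            for later_name, _ in jobs[len(allocation):]:
                allocation[later_name] = 0
            break
    return allocation


jobs = [("J1", 100), ("J2", 200)]
print(first_wave(150, jobs, spill_over=True))   # {'J1': 100, 'J2': 50}
print(first_wave(150, jobs, spill_over=False))  # {'J1': 100, 'J2': 0}
```

In the first model, 50 slots go to J2 while J1 is still running; in the second, those 50 slots sit idle until J1 finishes. My question is which of these the FIFO scheduler actually does.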
I was reading 'Hadoop: The Definitive Guide', and it says: "Early versions
of Hadoop had a very simple approach to scheduling users’ jobs: they ran in
order of submission, using a FIFO scheduler. Typically, each job would use
the whole cluster, so jobs had to wait their turn."