HDFS, mail # user - Re: Hadoop schedulers!


Re: Hadoop schedulers!
Rahul Bhattacharjee 2013-05-14, 02:14
Thanks a lot for the replies; they were really helpful.
On Tue, May 14, 2013 at 1:02 AM, Alok Kumar <[EMAIL PROTECTED]> wrote:

> Hi,
>
> As the name suggests, the Fair Scheduler allocates slots fairly among
> jobs.
> Let's say you have 10 map slots in your cluster, all occupied by
> job-1, which requires 30 map slots to finish. If, at the same time, another
> job (job-2) requires only 2 map slots to finish, slots will be given to
> job-2 so that it finishes quickly while job-1 keeps running.
>
>
>
> On Tue, May 14, 2013 at 12:02 AM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
>> Any pointers on my question?
>>
>> There is another question, kind of dumb, but I just wanted to clarify.
>>
>> Say, in a FIFO scheduler or a capacity scheduler, there are slots
>> available and the first job doesn't need all of them. Is the next job
>> in the queue then scheduled for execution, or does it still wait
>> for the first job to finish?
>>
>
> - Jobs don't wait for all the slots to be freed. A job starts executing as
> soon as it gets a slot. However, Hadoop does its best to allot a slot where
> the job can achieve data locality.
>
>
>
>>  Thanks,
>> Rahul
>>
>>
>> On Sat, May 11, 2013 at 8:31 PM, Rahul Bhattacharjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I was going through the job schedulers of Hadoop and could not see any
>>> major operational difference between the capacity scheduler and the fair
>>> share scheduler, apart from the fact that the fair share scheduler supports
>>> preemption and the capacity scheduler doesn't.
>>>
>>> Another thing is that the former creates logical pools based on certain
>>> attributes like username, user group, etc., and the latter has a notion of
>>> job queues. Can someone point me to any other major differences between
>>> these two types of schedulers?
>>>
>>> Another question in this regard: the capacity scheduler uses a FIFO
>>> queue. So it's still possible that a high-priority, long-running job using
>>> all the capacity allocated to the queue blocks all the other jobs after it
>>> in the queue. I think this is the expected behavior, but wanted to confirm.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>>
>>
>
> Thanks
> --
> Alok
>
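
For reference, the fair-share example quoted above (10 map slots, job-1
needing 30, job-2 needing 2) can be sketched in a few lines of Python. This
is only a rough illustration of max-min fair sharing under simplifying
assumptions: the function name fair_shares and its arguments are made up for
this sketch, it is not Hadoop's actual scheduler code, and it ignores pools,
weights, minimum shares, preemption and data locality.

```python
def fair_shares(total_slots, demands):
    """Max-min fair split of total_slots among jobs, capped by each job's demand.

    demands maps a (hypothetical) job name to the number of slots it wants.
    Returns a dict mapping each job name to the slots it is granted.
    """
    shares = {job: 0 for job in demands}
    remaining = total_slots
    # Jobs whose demand has not been fully satisfied yet.
    active = [job for job in demands if demands[job] > 0]

    while remaining > 0 and active:
        # Offer every unsatisfied job an equal slice (at least one slot).
        slice_size = max(remaining // len(active), 1)
        for job in list(active):
            if remaining == 0:
                break
            # Never grant more than the job still needs or than is left.
            grant = min(slice_size, demands[job] - shares[job], remaining)
            shares[job] += grant
            remaining -= grant
            if shares[job] == demands[job]:
                # Satisfied jobs drop out; their unused slice is
                # redistributed in the next round.
                active.remove(job)
    return shares


print(fair_shares(10, {"job-1": 30, "job-2": 2}))
```

With these numbers job-2 gets the 2 slots it asks for and job-1 gets the
other 8, which matches the description above: the small job finishes quickly
while the large one keeps running on the rest of the cluster.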