HDFS >> mail # user >> Re: Hadoop schedulers!


Re: Hadoop schedulers!
Thanks a lot for the replies, they were really helpful.
On Tue, May 14, 2013 at 1:02 AM, Alok Kumar <[EMAIL PROTECTED]> wrote:

> Hi,
>
> As the name suggests, the Fair Scheduler allocates slots fairly among the
> jobs.
> Let's say you have 10 map slots in your cluster and they are all occupied by
> job-1, which requires 30 map slots to finish. At the same time, another job,
> job-2, requires only 2 map slots to finish. Here, slots will be given to
> job-2 so that it finishes quickly, while job-1 keeps running.
>
>
>
> On Tue, May 14, 2013 at 12:02 AM, Rahul Bhattacharjee <
> [EMAIL PROTECTED]> wrote:
>
>> Any pointers on my question?
>>
>> There is another question, kind of dumb, but I just wanted to clarify.
>>
>> Say in a FIFO scheduler or a capacity scheduler, if there are slots
>> available and the first job doesn't need all of the available slots, is
>> the next job in the queue scheduled for execution, or does it still wait
>> for the first job to finish?
>>
>
> - Jobs don't wait for all the slots to be freed. Execution starts as
> soon as a job gets a slot. However, Hadoop does its best to allot a slot
> where the job can achieve data locality.
>
>
>
>>  Thanks,
>> Rahul
>>
>>
>> On Sat, May 11, 2013 at 8:31 PM, Rahul Bhattacharjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I was going through the job schedulers of Hadoop and could not see any
>>> major operational difference between the capacity scheduler and the fair
>>> share scheduler, apart from the fact that the fair share scheduler supports
>>> preemption and the capacity scheduler doesn't.
>>>
>>> Another thing is that the fair share scheduler creates logical pools based
>>> on certain attributes like username, user group, etc., and the capacity
>>> scheduler has a notion of job queues. Can someone point me to any other
>>> major differences between these two types of schedulers?
>>>
>>> Another question in this regard: the capacity scheduler uses a FIFO
>>> queue, so it's still possible for a high-priority, long-running job that
>>> uses all the capacity allocated to the queue to block all the other jobs
>>> after it in the queue. I think this is the expected behavior, but I wanted
>>> to confirm.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>>
>>
>
> Thanks
> --
> Alok
>
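
For anyone reading this thread later, here is a rough sketch of the slot example discussed above. It is a toy model, not Hadoop code: the function names `fifo_allocation` and `fair_allocation` are made up for illustration, and it ignores data locality, preemption, and the fact that real slots only free up as tasks finish. It only shows how 10 map slots might be divided under a strict FIFO policy versus a simple max-min fair split.

```python
def fifo_allocation(total_slots, demands):
    """Hand out slots strictly in submission order (dict order = queue order)."""
    allocation = {}
    free = total_slots
    for job, demand in demands.items():
        granted = min(demand, free)  # a job never gets more than it asks for
        allocation[job] = granted
        free -= granted
    return allocation


def fair_allocation(total_slots, demands):
    """Max-min fair split: share slots evenly, capping each job at its demand."""
    allocation = {job: 0 for job in demands}
    free = total_slots
    unsatisfied = [job for job, demand in demands.items() if demand > 0]
    while free > 0 and unsatisfied:
        # Equal share of what is left, at least one slot per round.
        share = max(free // len(unsatisfied), 1)
        for job in list(unsatisfied):
            if free == 0:
                break
            granted = min(share, demands[job] - allocation[job], free)
            allocation[job] += granted
            free -= granted
            if allocation[job] == demands[job]:
                unsatisfied.remove(job)  # leftover slots go to the other jobs
    return allocation


# The example from the thread: 10 map slots, job-1 wants 30, job-2 wants 2.
demands = {"job-1": 30, "job-2": 2}
print(fifo_allocation(10, demands))
print(fair_allocation(10, demands))

# FIFO when the first job does not need every slot: the next job gets the rest.
print(fifo_allocation(10, {"job-a": 4, "job-b": 10}))
```

In this model the fair split gives job-2 its 2 slots right away and leaves 8 for job-1, whereas strict FIFO leaves job-2 with nothing until job-1 releases slots. The last call matches Alok's second answer: when the first job needs only 4 of 10 slots, the next job in the queue starts on the remaining 6 instead of waiting.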