Capacity Scheduler questions
We are evaluating the Capacity Scheduler…

We would like to configure the equivalent of the Fair Scheduler setting
userMaxJobsDefault = 1 (i.e. we would like to limit each user to a single job
in the cluster).

· By default, the Capacity Scheduler allows multiple jobs from a
single user to run concurrently.

· From
appear to be limits for “the number of accepted/active jobs per user”.
However, the example capacity-scheduler.xml only has limits for active
tasks, e.g. the <queue>.maximum-initialized-active-tasks-per-user property.

· Also, the source file CapacitySchedulerConf.java includes the following
code, which suggests that the maximum number of jobs per user can be
configured via the init-accept-jobs-factor property. However, this is not
clear from the description of this property.

  public int getInitToAcceptJobsFactor(String queue) {
    int initToAccepFactor =
      rmConf.getInt(toFullPropertyName(queue, "init-accept-jobs-factor"),
          defaultInitToAcceptJobsFactor);
    if(initToAccepFactor <= 0) {
      throw new IllegalArgumentException(
          "Invalid maximum jobs per user configuration " + initToAccepFactor);
    }
    return initToAccepFactor;
  }
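
To make the behaviour of that getter concrete, here is a minimal, self-contained sketch of the same read-and-validate pattern. It is not the Hadoop code: a plain Map stands in for rmConf, the property prefix built by toFullPropertyName is hypothetical, the default of 10 is an assumption (the value of defaultInitToAcceptJobsFactor is not shown in the quoted source), and appending the offending value to the error message is also assumed.

```java
import java.util.HashMap;
import java.util.Map;

public class InitAcceptFactorSketch {
    // Assumed default; the quoted source does not show defaultInitToAcceptJobsFactor.
    public static final int DEFAULT_INIT_TO_ACCEPT_JOBS_FACTOR = 10;

    // Hypothetical stand-in for toFullPropertyName(); the real prefix is not shown.
    public static String toFullPropertyName(String queue, String property) {
        return "queue." + queue + "." + property;
    }

    // Reads the per-queue factor, falls back to the default, rejects values <= 0.
    public static int getInitToAcceptJobsFactor(Map<String, String> conf, String queue) {
        String raw = conf.get(toFullPropertyName(queue, "init-accept-jobs-factor"));
        int factor = (raw == null)
                ? DEFAULT_INIT_TO_ACCEPT_JOBS_FACTOR
                : Integer.parseInt(raw.trim());
        if (factor <= 0) {
            throw new IllegalArgumentException(
                    "Invalid maximum jobs per user configuration " + factor);
        }
        return factor;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        String key = toFullPropertyName("default", "init-accept-jobs-factor");

        // Property unset: the default is returned.
        System.out.println(getInitToAcceptJobsFactor(conf, "default"));

        // A value of 1 passes validation and is returned unchanged.
        conf.put(key, "1");
        System.out.println(getInitToAcceptJobsFactor(conf, "default"));

        // Zero or negative values are rejected.
        conf.put(key, "0");
        try {
            getInitToAcceptJobsFactor(conf, "default");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sketch only shows that the getter validates and returns a positive multiplier. The error message mentions "maximum jobs per user", but nothing in the getter itself says how the factor is turned into a per-user job limit, which is the part that remains unclear.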

· Also, other posts and sample XML files on the web refer to
property. However, I’ve tried setting this to 1, but it had no effect.

So… how can we configure the Capacity Scheduler to limit a user to a single
job in the cluster?


Also, I’m curious… a benefit of the Capacity Scheduler is that resource
limits can be specified in percentage terms, so if the cluster size changes,
the CS configuration does not have to change. Why, then, are some
properties specified in terms of tasks, e.g.
which would need to be reconfigured if the cluster size changed?
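
To illustrate the concern with a small arithmetic sketch (the slot counts, the 25% capacity and the 50-task limit are made-up numbers, and the class and method names are hypothetical, not Hadoop APIs): a percentage-based limit follows the cluster as it grows, whereas a fixed task count represents a shrinking share of it.

```java
public class LimitScalingSketch {
    // Slots granted by a percentage-based limit for a given cluster size.
    public static int slotsFromPercent(int totalSlots, double capacityPercent) {
        return (int) Math.floor(totalSlots * capacityPercent / 100.0);
    }

    // Share of the cluster that a fixed, absolute task limit represents.
    public static double absoluteLimitAsPercent(int totalSlots, int taskLimit) {
        return 100.0 * taskLimit / totalSlots;
    }

    public static void main(String[] args) {
        int[] clusterSizes = {100, 200, 400}; // total task slots (illustrative)
        double capacityPercent = 25.0;        // percentage-based limit (illustrative)
        int taskLimit = 50;                   // absolute task limit (illustrative)

        for (int totalSlots : clusterSizes) {
            System.out.printf(
                    "slots=%d  percent-based limit=%d slots  absolute limit=%d tasks (%.1f%% of cluster)%n",
                    totalSlots,
                    slotsFromPercent(totalSlots, capacityPercent),
                    taskLimit,
                    absoluteLimitAsPercent(totalSlots, taskLimit));
        }
    }
}
```

With these numbers the percentage-based limit grows from 25 to 100 slots as the cluster grows from 100 to 400 slots, while the fixed 50-task limit drops from 50% to 12.5% of the cluster unless it is reconfigured.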