

Re: Automatically mapping a job submitted by a particular user to a specific hadoop map-reduce queue
The 'standard' way to do this is to use queue ACLs to restrict each user to submitting jobs to a subset of queues, and then let the user decide which queue in that subset to submit a given job to.
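
As a rough sketch of what such queue ACLs could look like on a Hadoop 1.x-era cluster: the queue names (production, research, adhoc), the user names (etl_user, alice, bob) and the group name (analysts) below are all made up for illustration, and the property names should be checked against the documentation for the exact release in use. Each ACL value is a comma-separated list of users, then a space, then a comma-separated list of groups.

```xml
<!-- mapred-site.xml: declare the queues and turn ACL checks on.
     Queue names here are hypothetical. -->
<configuration>
  <property>
    <name>mapred.queue.names</name>
    <value>production,research,adhoc</value>
  </property>
  <property>
    <name>mapred.acls.enabled</name>
    <value>true</value>
  </property>
</configuration>

<!-- mapred-queue-acls.xml: who may submit to which queue.
     User and group names here are hypothetical. -->
<configuration>
  <property>
    <name>mapred.queue.production.acl-submit-job</name>
    <value>etl_user</value>
  </property>
  <property>
    <name>mapred.queue.research.acl-submit-job</name>
    <value>alice,bob</value>
  </property>
  <property>
    <!-- leading space: no individual users, only the group -->
    <name>mapred.queue.adhoc.acl-submit-job</name>
    <value> analysts</value>
  </property>
</configuration>
```

With something like this in place, a job submitted to a queue the user is not listed for should be rejected. The user still has to name the queue (for example via mapred.job.queue.name); the ACLs only limit which names are accepted.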

Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Apr 24, 2013, at 6:22 PM, Sagar Mehta wrote:

> Hi Guys,
>
> We have a general-purpose Hive cluster [about 200 nodes] which is used for various kinds of jobs:
> - Production
> - Experimental/Research
> - Ad hoc queries
> We are using the fair-share scheduler to schedule them, and we have three corresponding pools in the scheduler.
>
> Here is what we want.
>
> A hive query submitted by a user with user-name A should go to one of the pools above based on a pre-defined mapping. We are wondering where/how to specify this mapping?
>
> We can do this manually by adding -Dmapred.job.queue.name="X" on a particular job run.
>
> This puts the job on the map-reduce queue named "X" and the following configuration in the fair-share scheduler
>
>   <property>
>     <name>mapred.fairscheduler.poolnameproperty</name>
>     <value>mapred.job.queue.name</value>
>   </property>
>
> maps this to a pool named "X" in the fair-share scheduler.
>
> However we [while wearing our Hadoop developer/admin hat] don't want the user/analyst to specify that so as to enforce some cluster-use policy.
>
> Based on his/her username we want to automatically select which hadoop queue and subsequently which fair-share scheduler pool, his/her job should go to. I'm pretty sure this is a common use-case and wondering how to do this in Hadoop.
>
> Any help/insights/pointers would be greatly appreciated.
>
> Sagar
> PS - Btw we are using Cloudera cdh3u2 and the user jobs are Hive queries.
>
>
>
