MapReduce >> mail # user >> Fair scheduler.


Patai Sangbutsarakum 2012-10-14, 00:33
Harsh J 2012-10-14, 03:30
Patai Sangbutsarakum 2012-10-15, 21:27
Harsh J 2012-10-15, 22:18
Re: Fair scheduler.
Hi Harsh,
Thanks for breaking it down clearly. I would say I am 98% successful
following the instructions.
The remaining 2% is about hadoop.tmp.dir.

Let's say I have 2 users:
userA is the user that starts HDFS and MapReduce
userB is a regular user

If I use the default value of hadoop.tmp.dir,
/tmp/hadoop-${user.name},
I can submit jobs as userA but not as userB:
user=userB, access=WRITE, inode="/tmp/hadoop-userA/mapred/staging":userA:supergroup:drwxr-xr-x
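
The denial in that message follows from the owner/group/other permission bits on the staging directory. Here is a minimal Python sketch of that check, assuming userB is not a member of supergroup (the message does not say either way) and ignoring the HDFS superuser bypass. The function name can_write is made up for illustration; it is not a Hadoop API.

```python
def can_write(mode, owner, group, user, user_groups):
    """Simplified POSIX-style check: may `user` write to this inode?

    `mode` is a string such as 'drwxr-xr-x'. This is a sketch only;
    it does not model the HDFS superuser, who bypasses these checks.
    """
    perms = mode[1:]  # drop the leading file-type character ('d')
    if user == owner:
        triplet = perms[0:3]   # owner bits
    elif group in user_groups:
        triplet = perms[3:6]   # group bits
    else:
        triplet = perms[6:9]   # "other" bits
    return triplet[1] == "w"


# Values copied from the error message; the group sets are assumptions.
MODE, OWNER, GROUP = "drwxr-xr-x", "userA", "supergroup"

print(can_write(MODE, OWNER, GROUP, "userA", {"userA"}))  # True
print(can_write(MODE, OWNER, GROUP, "userB", {"userB"}))  # False
```

userB falls into the "other" class, whose bits are r-x, so WRITE is refused on a directory that userA created with these permissions.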

I googled around; someone recommended changing hadoop.tmp.dir to /tmp/hadoop.
This almost works; the catch is:

If I submit as userA, it creates /tmp/hadoop on the local machine, owned
by userA.userA,
and when I then try to submit a job from the same machine as userB, I
get "Error creating temp dir in hadoop.tmp.dir /tmp/hadoop due to
Permission denied"
(because /tmp/hadoop is owned by userA.userA). Vice versa, if I delete
/tmp/hadoop and let the directory be created by userB, userA will not
be able to submit jobs.

Which is the right approach to use here?
Please advise.

Patai
On Mon, Oct 15, 2012 at 3:18 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Hi Patai,
>
> Reply inline.
>
> On Tue, Oct 16, 2012 at 2:57 AM, Patai Sangbutsarakum
> <[EMAIL PROTECTED]> wrote:
>> Thanks for the input,
>>
>> I am reading the document; I forgot to mention that I am on cdh3u4.
>
> That version should have the support for all of this.
>
>>> If you point your poolname property to mapred.job.queue.name, then you
>>> can leverage the Per-Queue ACLs
>>
>> Does that mean that if I plan 3 fair scheduler pools, I have to
>> configure 3 capacity scheduler queues so that each pool
>> can leverage the Per-Queue ACL of its queue?
>
> Queues are not hard-tied into CapacityScheduler. You can have generic
> queues in MR. And FairScheduler can bind its Pool concept into the
> Queue configuration.
>
> All you need to do is the following:
>
> 1. Map the FairScheduler pool name onto the queue name itself:
>
> mapred.fairscheduler.poolnameproperty set to 'mapred.job.queue.name'
>
> 2. Define your required queues:
>
> mapred.queue.names set to "default,foo,bar" for example, for 3 queues:
> default, foo and bar.
>
> 3. Define Submit ACLs for each Queue:
>
> mapred.queue.default.acl-submit-job set to "patai,foobar users,adm"
> (usernames groupnames)
>
> mapred.queue.foo.acl-submit-job set to "spam eggs"
>
> Likewise for remaining queues, as you need it…
>
> 4. Enable ACLs and restart JT.
>
> mapred.acls.enabled set to "true"
>
> 5. Users then use the right API to set queue names before submitting
> jobs, or use -Dmapred.job.queue.name=value via CLI (if using Tool):
> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobConf.html#setQueueName(java.lang.String)
>
> 6. Done.
>
> Let us know if this works!
>
> --
> Harsh J
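
Steps 1 to 4 quoted above could be sketched as a set of JobTracker properties. The Python below only renders them as a Hadoop-style <configuration> fragment so the settings can be reviewed in one place; it is a sketch under assumptions, not a tested cluster config. The values are the examples from the reply. The queue list is keyed as mapred.queue.names, which is the name Hadoop 1.x's mapred-default.xml uses as far as I know; check it against your distribution. The helper names build_configuration and render are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Example values from the quoted steps. The key mapred.queue.names is an
# assumption about the Hadoop 1.x / CDH3 property name for the queue list.
PROPERTIES = {
    # Step 1: use the job's queue name as the fair scheduler pool name.
    "mapred.fairscheduler.poolnameproperty": "mapred.job.queue.name",
    # Step 2: declare the queues.
    "mapred.queue.names": "default,foo,bar",
    # Step 3: submit ACLs, written as "users<space>groups".
    "mapred.queue.default.acl-submit-job": "patai,foobar users,adm",
    "mapred.queue.foo.acl-submit-job": "spam eggs",
    # Step 4: turn ACL checking on (the JobTracker must be restarted).
    "mapred.acls.enabled": "true",
}


def build_configuration(properties):
    """Build a <configuration> element with one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def render(root):
    """Serialize the element tree to an XML string."""
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render(build_configuration(PROPERTIES)))
```

The printed properties would need to be merged into the JobTracker's existing mapred-site.xml rather than replace it. Step 5 (choosing the queue per job) stays on the client side, via JobConf.setQueueName or -Dmapred.job.queue.name=value.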
Arpit Gupta 2012-10-16, 23:12
Patai Sangbutsarakum 2012-10-16, 23:52
Harsh J 2012-10-17, 07:00
Goldstone, Robin J. 2012-10-17, 15:09
Harsh J 2012-10-17, 15:43
Patai Sangbutsarakum 2012-10-17, 17:40
Harsh J 2012-10-17, 17:53
Luke Lu 2012-10-18, 10:11