HDFS, mail # user - RE: issue with permissions of mapred.system.dir


RE: issue with permissions of mapred.system.dir
Kartashov, Andy 2012-10-10, 13:34
Have you created a sub-directory under /user for user robing, i.e. /user/robing?

Depending on your version of Hadoop, it is important to set up your directory structure and its users/groups properly.

Here is just an example:
drwxrwxrwt   - hdfs     supergroup          0 2012-04-19 15:14 /tmp
drwxr-xr-x   - hdfs     supergroup          0 2012-04-19 15:16 /var
drwxr-xr-x   - hdfs     supergroup          0 2012-04-19 15:16 /var/lib
drwxr-xr-x   - hdfs     supergroup          0 2012-04-19 15:16 /var/lib/hadoop-hdfs
drwxr-xr-x   - hdfs     supergroup          0 2012-04-19 15:16 /var/lib/hadoop-hdfs/cache
drwxr-xr-x   - mapred   supergroup          0 2012-04-19 15:19 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x   - mapred   supergroup          0 2012-04-19 15:29 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt   - mapred   supergroup          0 2012-04-19 15:33 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
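
The listing above uses symbolic modes (drwxrwxrwt, drwxr-xr-x), while the rest of this thread talks in octal (700, 755). As a reading aid, here is a minimal Python sketch that converts one notation to the other. The function name mode_to_octal is made up for illustration; it is not part of Hadoop, and it only looks at the mode string, not at any live filesystem.

```python
def mode_to_octal(symbolic: str) -> str:
    """Convert a 10-character ls-style mode such as 'drwxrwxrwt' to octal digits.

    Hypothetical helper for illustration only; not a Hadoop API.
    """
    if len(symbolic) != 10:
        raise ValueError(f"expected 10 characters, got {symbolic!r}")

    perms = symbolic[1:]  # drop the file-type character ('d', '-', ...)
    special = 0           # setuid (4), setgid (2), sticky (1)
    digits = []

    for index, special_weight in enumerate((4, 2, 1)):
        triple = perms[index * 3:index * 3 + 3]
        value = 0
        if triple[0] == "r":
            value += 4
        if triple[1] == "w":
            value += 2
        # Lowercase 's'/'t' mean "execute plus special bit".
        if triple[2] in ("x", "s", "t"):
            value += 1
        # Any of 's', 'S', 't', 'T' means the special bit is set.
        if triple[2] in ("s", "S", "t", "T"):
            special += special_weight
        digits.append(str(value))

    body = "".join(digits)
    return str(special) + body if special else body


if __name__ == "__main__":
    for mode in ("drwxrwxrwt", "drwxr-xr-x", "drwx------"):
        print(mode, "->", mode_to_octal(mode))
```

Read this way, /tmp and the staging directory in the example are world-writable with the sticky bit set, and the other directories are writable only by their owner.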

user directory /user/joe
-rw-r--r--   - joe supergroup              0 2012-02-13 12:21 /input/core-site.xml
Andy Kartashov
MPAC
Architecture R&D, Co-op
1340 Pickering Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787
[EMAIL PROTECTED]

From: Goldstone, Robin J. [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 09, 2012 7:45 PM
To: [EMAIL PROTECTED]
Subject: issue with permissions of mapred.system.dir

I am bringing up a Hadoop cluster for the first time (but am an experienced sysadmin with lots of cluster experience) and am running into an issue with permissions on mapred.system.dir. It has generally been a chore to figure out all the various directories that need to be created to get Hadoop working (some on the local FS, others within HDFS), to get the right ownership and permissions, etc. I think I am mostly there but can't seem to get past my current issue with mapred.system.dir.

Some general info first:
OS: RHEL6
Hadoop version: hadoop-1.0.3-1.x86_64

20 node cluster configured as follows
1 node as primary namenode
1 node as secondary namenode + job tracker
18 nodes as datanode + tasktracker

I have HDFS up and running and have the following in mapred-site.xml:
<property>
  <name>mapred.system.dir</name>
  <value>hdfs://hadoop1/mapred</value>
  <description>Shared data for JT - this must be in HDFS</description>
</property>

I have created this directory in HDFS with owner mapred:hadoop and permissions 700, which seems to be the most common recommendation among the many, often conflicting, articles about how to set up Hadoop. Here is the top level of my filesystem:
hyperion-hdp4@hdfs:hadoop fs -ls /
Found 3 items
drwx------   - mapred hadoop          0 2012-10-09 12:58 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user

Note that it doesn't really seem to matter what permissions I set on /mapred, since the JobTracker changes them to 700 when it starts up.

However, when I try to run the hadoop example teragen program as a "regular" user I am getting this error:
hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen -D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
Generating 10000000000 using 2 maps with step of 5000000000
12/10/09 16:27:02 INFO mapred.JobClient: Running job: job_201210072045_0003
12/10/09 16:27:03 INFO mapred.JobClient:  map 0% reduce 0%
12/10/09 16:27:03 INFO mapred.JobClient: Job complete: job_201210072045_0003
12/10/09 16:27:03 INFO mapred.JobClient: Counters: 0
12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job initialization failed:
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
<rest of stack trace omitted>

This seems to be saying that it is trying to write to the /mapred directory in HDFS as me (robing) rather than as mapred, the username under which the jobtracker and tasktracker run.
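
To make the denial in the error message concrete, here is a rough Python sketch of the owner/group/other check that a POSIX-style permission model applies to a directory. It is a simplified model under stated assumptions, not Hadoop's actual code: the Inode class and the allowed function are invented for illustration, the superuser bypass is left out, and robing's group membership is not given in this thread, so both cases are tried.

```python
from dataclasses import dataclass

# Access bits, as in the octal notation used in this thread.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Inode:
    """Hypothetical stand-in for a directory entry; not a Hadoop class."""
    name: str
    owner: str
    group: str
    mode: int  # e.g. 0o700


def allowed(inode: Inode, user: str, groups: set, access: int) -> bool:
    """Pick exactly one permission class (owner, then group, then other)
    and test the requested access bits against it."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & access) == access


if __name__ == "__main__":
    # Values taken from the error message: inode="mapred":mapred:hadoop:rwx------
    mapred_dir = Inode("mapred", "mapred", "hadoop", 0o700)

    # Assumption: robing's groups are unknown, so try with and without 'hadoop'.
    print(allowed(mapred_dir, "robing", {"robing"}, EXECUTE))
    print(allowed(mapred_dir, "robing", {"hadoop"}, EXECUTE))
    print(allowed(mapred_dir, "mapred", {"hadoop"}, EXECUTE))

    # After a chmod to 755, group and other both gain r-x.
    mapred_dir.mode = 0o755
    print(allowed(mapred_dir, "robing", {"robing"}, EXECUTE))
```

In this model, mode 700 denies EXECUTE (directory traversal) to anyone other than the owner, regardless of group, which matches the access=EXECUTE denial reported for user=robing.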

To verify this is what is happening, I manually changed the permissions on /mapred from 700 to 755 since it claims to want execute access:
hyperion-hdp4@mapred:hadoop fs -chmod 755 /mapred
hyperion-hdp4@mapred:hadoop fs -ls /
Found 3 items
drwxr-xr-x   - mapred hadoop          0 2012-10-09 12:58 /mapred
drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
drwxr-xr-x   - hd