Re: File Permissions on s3 FileSystem
On 23/10/12 13:32, Parth Savani wrote:
> Hello Everyone,
>         I am trying to run a hadoop job with s3n as my filesystem.
> I changed the following properties in my hdfs-site.xml
>
> fs.default.name=s3n://KEY:VALUE@bucket/
A good practice is to set these two properties in core-site.xml if you
will use S3 often:
<property>
     <name>fs.s3.awsAccessKeyId</name>
     <value>AWS_ACCESS_KEY_ID</value>
</property>

<property>
     <name>fs.s3.awsSecretAccessKey</name>
     <value>AWS_SECRET_ACCESS_KEY</value>
</property>
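
If you stick with the s3n:// scheme (as in your URIs), the native S3
filesystem reads the analogous fs.s3n.* keys, so set those as well:

<property>
     <name>fs.s3n.awsAccessKeyId</name>
     <value>AWS_ACCESS_KEY_ID</value>
</property>

<property>
     <name>fs.s3n.awsSecretAccessKey</name>
     <value>AWS_SECRET_ACCESS_KEY</value>
</property>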

After that, you can access your URIs in a friendlier way:
S3:
  s3://<s3-bucket>/<s3-filepath>

S3n:
  s3n://<s3-bucket>/<s3-filepath>
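
For example, with the keys in core-site.xml you can list a bucket without
embedding the credentials in the URI (bucket and path below are just
placeholders):

  hadoop fs -ls s3n://your-bucket/some/path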

> mapreduce.jobtracker.staging.root.dir=s3n://KEY:VALUE@bucket/tmp
>
> When I run the job from EC2, I get the following error:
>
> The ownership on the staging directory
> s3n://KEY:VALUE@bucket/tmp/ec2-user/.staging is not as expected. It is
> owned by   The directory must be owned by the submitter ec2-user or by
> ec2-user
> at
> org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
>
> I am using the Cloudera CDH4 Hadoop distribution. The error is thrown from
> the JobSubmissionFiles.java class:
>  public static Path getStagingDir(JobClient client, Configuration conf)
>   throws IOException, InterruptedException {
>     Path stagingArea = client.getStagingAreaDir();
>     FileSystem fs = stagingArea.getFileSystem(conf);
>     String realUser;
>     String currentUser;
>     UserGroupInformation ugi = UserGroupInformation.getLoginUser();
>     realUser = ugi.getShortUserName();
>     currentUser =
> UserGroupInformation.getCurrentUser().getShortUserName();
>     if (fs.exists(stagingArea)) {
>       FileStatus fsStatus = fs.getFileStatus(stagingArea);
>       String owner = fsStatus.getOwner();
>       if (!(owner.equals(currentUser) || owner.equals(realUser))) {
>          throw new IOException("The ownership on the staging directory " +
>                      stagingArea + " is not as expected. " +
>                      "It is owned by " + owner + ". The directory must " +
>                      "be owned by the submitter " + currentUser + " or " +
>                      "by " + realUser);
>       }
>       }
>       if (!fsStatus.getPermission().equals(JOB_DIR_PERMISSION)) {
>         LOG.info("Permissions on staging directory " + stagingArea + "
> are " +
>           "incorrect: " + fsStatus.getPermission() + ". Fixing
> permissions " +
>           "to correct value " + JOB_DIR_PERMISSION);
>         fs.setPermission(stagingArea, JOB_DIR_PERMISSION);
>       }
>     } else {
>       fs.mkdirs(stagingArea,
>           new FsPermission(JOB_DIR_PERMISSION));
>     }
>     return stagingArea;
>   }
>
>
> I think my job calls getOwner(), which returns NULL since S3 does not
> have file permissions, which results in the IOException that I am
> getting.
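If it helps to confirm that, you could print the owner that the s3n
FileStatus reports for the staging path. A quick sketch (bucket and path
are placeholders; the credentials come from core-site.xml):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckStagingOwner {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Staging path used by the job; adjust to your bucket.
    Path staging = new Path("s3n://your-bucket/tmp/ec2-user/.staging");
    FileSystem fs = staging.getFileSystem(conf);
    FileStatus st = fs.getFileStatus(staging);
    // On s3n this typically comes back empty, which is what makes the
    // ownership check in JobSubmissionFiles fail.
    System.out.println("owner = '" + st.getOwner() + "'");
  }
}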
With which user are you launching the job in EC2?
>
> Any workaround for this? Any idea how I could use S3 as the filesystem
> with Hadoop in distributed mode?

Look here:
http://wiki.apache.org/hadoop/AmazonS3
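
One common workaround is to leave fs.default.name pointing at HDFS (so
the staging directory lives on a filesystem with real owners and
permissions) and pass s3n:// URIs only as the job input/output paths.
Roughly (hostname, bucket and paths are placeholders):

<property>
     <name>fs.default.name</name>
     <value>hdfs://namenode:8020</value>
</property>

  hadoop jar your-job.jar YourJob s3n://your-bucket/input s3n://your-bucket/output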
