Re: Why is my output directory owned by yarn?
In insecure mode the containers run as the daemon's owner, i.e.
"yarn". Since the LocalFileSystem implementation has no way to
impersonate other users (we don't run as root, etc.), it can create
files only as the "yarn" user. On HDFS, we can send the right username
in as a form of authentication, and it's reflected on the created files.

If you enable the LinuxContainerExecutor (or generally enable
security) then the containers run after being setuid'd to the
submitting user, and your files would appear with the right owner.
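
As a rough sketch of the first option, the NodeManager's yarn-site.xml
would carry something like the fragment below. The group name "hadoop"
is only a placeholder I've assumed here; use whatever group your
NodeManager runs under. The setuid container-executor binary and its
container-executor.cfg also need to be set up to match, and the exact
behaviour in non-secure clusters can differ between Hadoop versions, so
please check the docs for your release.

```xml
<!-- yarn-site.xml on each NodeManager: a minimal sketch, not a complete setup -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <!-- "hadoop" is an assumed placeholder group name -->
  <value>hadoop</value>
</property>
```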

On Wed, Oct 30, 2013 at 1:49 AM, Bill Sparks <[EMAIL PROTECTED]> wrote:
>
> I have a strange use case and I'm looking for some debugging help.
>
>
> Use Case:
>
> If I run the hadoop mapred example wordcount program and write the output
> to HDFS, the output directory has the correct ownership.
>
> E.g.
>
> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar wordcount /user/jdoe/simple/HF.txt /users/jdoe/simple/outtest1
>
> hdfs dfs -ls simple
> Found 3 items
> drwxr-xr-x - jdoe supergroup 0 2013-10-25 21:26 simple/HF.out
> -rw-r--r-- 1 jdoe supergroup 610157 2013-10-25 21:21 simple/HF.txt
> drwxr-xr-x - jdoe supergroup 0 2013-10-29 14:50 simple/outtest1
>
> Whereas if I write to a global filesystem, my output directory is owned by
> yarn.
>
>
> E.g.
>
> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar wordcount /user/jdoe/simple/HF.txt file:///scratch/jdoe/outtest1
>
> ls -l /scratch/jdoe
> total 8
> drwxr-xr-x 2 root root 4096 Oct 28 23:26 logs
> drwxr-xr-x 2 yarn yarn 4096 Oct 28 23:23 outtest1
>
>
>
> I've looked at the container log files and saw no errors. The only thing
> I can think of is that the user authentication mode is "files:ldap" and the
> nodemanager nodes do not have access to the corporate LDAP server, so it's
> working off the local /etc/shadow, which does not have my credentials. So it
> might just default to "yarn".
>
> I did find the following warning:
>
> 2013-10-29 14:58:52,184 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=jdoe  OPERATION=Container Finished - Succeeded  TARGET=ContainerImpl  RESULT=SUCCESS  APPID=application_1383020136544_0005  CONTAINERID=container_1383020136544_0005_01_000001
> ...
> 2013-10-29 14:58:53,062 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Trying to stop unknown container container_1383020136544_0005_01_000001
> 2013-10-29 14:58:53,062 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=UnknownUser  IP=10.128.0.17  OPERATION=Stop Container Request  TARGET=ContainerManagerImpl  RESULT=FAILURE  DESCRIPTION=Trying to stop unknown container!  APPID=application_1383020136544_0005  CONTAINERID=container_1383020136544_0005_01_000001
>
>
>
>
> Thanks,
>    John
>

--
Harsh J