HDFS, mail # dev - validating user IDs


validating user IDs
Colin McCabe 2012-06-11, 22:57
Hi all,

I recently pulled the latest source, and ran a full build.  The
command line was this:
mvn compile -Pnative

I was confronted with this:

[INFO] Requested user cmccabe has id 500, which is below the minimum
allowed 1000
[INFO] FAIL: test-container-executor
[INFO] ===============================================
[INFO] 1 of 1 test failed
[INFO] Please report to [EMAIL PROTECTED]
[INFO] ===============================================
[INFO] make[1]: *** [check-TESTS] Error 1
[INFO] make[1]: Leaving directory
`/home/cmccabe/hadoop4/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/container-executor'

Needless to say, it didn't do much to improve my mood.  I was even
less happy when I discovered that -DskipTests has no effect on native
tests (they always run).  See HADOOP-8480.

Unfortunately, it seems like this problem is popping up more and more
in our native code.  It first appeared in test-task-controller (see
MAPREDUCE-2376) and then later in test-container-executor
(HADOOP-8499).  The basic problem seems to be the hardcoded assumption
that all user IDs below 1000 are system IDs.

It is true that there are configuration files that can be changed to
alter the minimum user ID, but unfortunately these configuration files
are not used by the unit tests.  So anyone developing on a platform
where the user IDs start at 500 is now a second-class citizen, unable
to run unit tests.  This includes anyone running Red Hat, MacOS,
Fedora, etc.

Personally, I can change my user ID.  It's a time-consuming process,
because I need to re-uid all files, but I can do it.  This luxury may
not be available to everyone, though: developers who don't have root
on their machines, or who use a pre-assigned user ID to connect to
NFS, come to mind.

It's true that we could hack around this with environment variables.
It might even be possible to have Maven set these environment
variables automatically from the current user ID.  However, the larger
question I have here is whether this UID validation scheme even makes
any sense.  I have a user named "nobody" whose user ID is 65534.
Surely I should not be able to run map-reduce jobs as this user?  Yet,
under the current system, I can do exactly that.  The root of the
problem seems to be that there is both a default minimum and a default
maximum for "automatic" user IDs, while the current check looks only
at the minimum.  This configuration seems to be stored in
/etc/login.defs.

On my system, it has:
SYSTEM_UID_MIN            100
SYSTEM_UID_MAX            499
UID_MIN                  500
UID_MAX                 60000

So that means that anything over 60000 (like nobody) is not considered
a valid user ID for regular users.
We could potentially read this file (at least on Linux) and get more
sensible defaults.
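
To make the idea of reading /etc/login.defs concrete, here is a rough
sketch in Python (not the container-executor's actual C code).  The
function names parse_login_defs and is_regular_uid are hypothetical,
and the fallback values (1000 and 60000) are my assumptions for when
the file does not define UID_MIN or UID_MAX.  The sample text is the
excerpt quoted above.

```python
def parse_login_defs(text):
    """Return the integer-valued settings found in login.defs-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split()
        if len(parts) != 2:
            continue  # blank line or a setting that is not "KEY VALUE"
        key, value = parts
        try:
            settings[key] = int(value)
        except ValueError:
            continue  # non-numeric setting, not relevant here
    return settings


def is_regular_uid(uid, settings, fallback_min=1000, fallback_max=60000):
    """Check uid against UID_MIN..UID_MAX (fallbacks are assumed defaults)."""
    low = settings.get("UID_MIN", fallback_min)
    high = settings.get("UID_MAX", fallback_max)
    return low <= uid <= high


# Sample taken from the excerpt above; on a real system one would read
# the text from /etc/login.defs instead.
SAMPLE = """\
SYSTEM_UID_MIN            100
SYSTEM_UID_MAX            499
UID_MIN                  500
UID_MAX                 60000
"""

settings = parse_login_defs(SAMPLE)
print(is_regular_uid(500, settings))    # True: a regular user like mine
print(is_regular_uid(65534, settings))  # False: "nobody" is above UID_MAX
```

With a check along these lines, a user ID of 500 would pass on my
system while "nobody" would be rejected, which is the behavior I would
expect.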

I am also curious if we could simply check whether the user we're
trying to run the job as has a valid login shell.  System users are
almost always set to have a login shell of /bin/false or
/sbin/nologin.
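
A rough sketch of the login shell check, again in Python rather than
the native C code.  The helper names are hypothetical.  The list of
non-login shells contains the two paths mentioned above plus
/usr/sbin/nologin, which I added because some distributions use that
path; a real check would probably need to be configurable.

```python
import pwd

# Shells that mark an account as unable to log in (assumed list).
NON_LOGIN_SHELLS = {"/bin/false", "/sbin/nologin", "/usr/sbin/nologin"}


def is_login_shell(shell):
    """Return True if the given passwd shell field allows logins."""
    # An empty shell field in /etc/passwd means the default shell
    # (/bin/sh), so it is treated as a login shell here.
    return shell not in NON_LOGIN_SHELLS


def user_has_login_shell(uid):
    """Look up uid in the passwd database and check its shell."""
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        return False  # unknown user ID: refuse rather than guess
    return is_login_shell(entry.pw_shell)


print(is_login_shell("/bin/bash"))      # True
print(is_login_shell("/sbin/nologin"))  # False
```

This does not depend on numeric ranges at all, so it would behave the
same whether regular user IDs start at 500 or at 1000.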

Thoughts?
Colin