HDFS, mail # user - HADOOP_CLIENT_OPTS getting set multiple times (is this a bug?)


HADOOP_CLIENT_OPTS getting set multiple times (is this a bug?)
Tom Brown 2013-03-12, 19:50
I am using Hadoop 1.0.2 (the stock .deb, compiled by Hortonworks AFAIK).

I noticed that my task tracker processes have multiple "-Xmx" configs
attached, and that the later ones (128m) were overriding the ones I
had intended to be used (500m).

After digging through the various scripts, I found that the problem is
happening because "hadoop-env.sh" is getting invoked multiple times.
The deb file created a link from "/etc/profile.d/" to hadoop-env.sh,
so this file is run whenever I log in. The "hadoop" script also
invokes hadoop-env.sh (via "hadoop-config.sh"). The following sequence
is causing the problem:

1. The first time hadoop-env.sh is invoked (when the user logs in),
HADOOP_CLIENT_OPTS is set to "-Xmx128m ...".

2. The second time hadoop-env.sh is invoked (when a Hadoop process is
started), HADOOP_OPTS is set to "... $HADOOP_CLIENT_OPTS" (thereby
including the memory setting for all Hadoop processes in general).

3. Also during the second execution, HADOOP_CLIENT_OPTS is recursively
set to "-Xmx128m $HADOOP_CLIENT_OPTS" (so it now contains "-Xmx128m
-Xmx128m").

4. When the actual hadoop process is started, it always includes both
JAVA_HEAP_MAX and HADOOP_OPTS (in that order), but since HADOOP_OPTS
now also carries an "-Xmx" setting and appears later on the command
line, that setting takes precedence.
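
To make the sequence above easier to reproduce, here is a minimal
sketch that imitates sourcing hadoop-env.sh twice in the same
environment. It is not the real script: simulate_hadoop_env is a
hypothetical stand-in, the order of the two assignments is my
assumption, and "-Dexample=1" is a placeholder for the "..." part of
HADOOP_OPTS.

```shell
#!/bin/sh
# Hypothetical stand-in for the two relevant assignments in hadoop-env.sh.
# Assumptions: the assignment order, and -Dexample=1 as a placeholder for
# the other options that the post abbreviates as "...".
simulate_hadoop_env() {
    HADOOP_OPTS="-Dexample=1 $HADOOP_CLIENT_OPTS"
    HADOOP_CLIENT_OPTS="-Xmx128m $HADOOP_CLIENT_OPTS"
    export HADOOP_OPTS HADOOP_CLIENT_OPTS
}

# Start from a clean environment, as on a fresh login.
unset HADOOP_OPTS HADOOP_CLIENT_OPTS

# First run: login, via the /etc/profile.d/ link.
simulate_hadoop_env
OPTS_AFTER_LOGIN="$HADOOP_OPTS"
echo "after login: HADOOP_OPTS=[$HADOOP_OPTS]"
echo "after login: HADOOP_CLIENT_OPTS=[$HADOOP_CLIENT_OPTS]"

# Second run: the hadoop script sources the file again via hadoop-config.sh.
simulate_hadoop_env
echo "after start: HADOOP_OPTS=[$HADOOP_OPTS]"
echo "after start: HADOOP_CLIENT_OPTS=[$HADOOP_CLIENT_OPTS]"
```

After the second call, HADOOP_OPTS contains "-Xmx128m" even though it
did not after the first, and HADOOP_CLIENT_OPTS contains it twice,
which matches steps 2 and 3.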

I couldn't find any bug that matched this, so I thought I'd reach out
to the community: Is this a known bug? Do the scripts and deb file
belong to Hadoop in general, or is this the responsibility of a
specific distribution?

Thanks in advance!

--Tom