
HDFS, mail # user - Site-specific dfs.client.local.interfaces setting not respected for Yarn MR container


Re: Site-specific dfs.client.local.interfaces setting not respected for Yarn MR container
Chris Mawata 2013-12-15, 20:57
You might have better luck with an alternative approach to avoiding IPv6,
which is to add the following to your hadoop-env.sh:

HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
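
A minimal sketch of what that could look like. The mapred-site.xml part is an
assumption about where your task JVMs pick up their options; I have not
verified that it is needed for the container binding:

In hadoop-env.sh on both Host A and Host B:

  export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

In mapred-site.xml (merge the value with any existing task JVM options):

  <property>
    <!-- assumption: pass the IPv4 preference on to map task JVMs as well -->
    <name>mapreduce.map.java.opts</name>
    <value>-Djava.net.preferIPv4Stack=true</value>
  </property>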

Chris

On 12/14/2013 11:38 PM, Jeff Stuckman wrote:
>
> Hello,
>
> I have set up a two-node Hadoop cluster on Ubuntu 12.04 running
> streaming jobs with Hadoop 2.2.0. I am having problems running
> tasks on an NM which is on a different host than the RM, and I believe
> that this is happening because the NM host's
> dfs.client.local.interfaces property is not having any effect.
>
> I have two hosts set up as follows:
>
> Host A (1.2.3.4):
>
> NameNode
>
> DataNode
>
> ResourceManager
>
> Job History Server
>
> Host B (5.6.7.8):
>
> DataNode
>
> NodeManager
>
> On each host, hdfs-site.xml was edited to change
> dfs.client.local.interfaces from an interface name ("eth0") to the
> IPv4 address representing that host's interface ("1.2.3.4" or
> "5.6.7.8"). This is to prevent the HDFS client from randomly binding
> to the IPv6 side of the interface (it randomly swaps between the IPv4
> and IPv6 addresses, due to the random bind IP selection in the DFS
> client), which was causing other problems.
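>
> For reference, the hdfs-site.xml entry on Host B looks roughly like this
> (Host A uses 1.2.3.4 instead):
>
>   <property>
>     <!-- bind DFS client sockets to this host's IPv4 address -->
>     <name>dfs.client.local.interfaces</name>
>     <value>5.6.7.8</value>
>   </property>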
>
> However, I am observing that the Yarn container on the NM appears to
> inherit the property from the copy of hdfs-site.xml on the RM, rather
> than reading it from the local configuration file. In other words,
> setting the dfs.client.local.interfaces property in Host A's
> configuration file causes the Yarn containers on Host B to use the same
> value of the property. This causes the map task to fail, as the
> container cannot establish a TCP connection to the HDFS. However, on
> Host B, other commands that access the HDFS (such as "hadoop fs") do
> work, as they respect the local value of the property.
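>
> (As a sanity check of what each host's command-line client resolves, which
> does not show what a launched container sees, I can run the following on
> both hosts:
>
>   hdfs getconf -confKey dfs.client.local.interfaces
>
> On Host B this prints 5.6.7.8, consistent with "hadoop fs" respecting the
> local file.)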
>
> To illustrate with an example, I start a streaming job from the
> command line on Host A:
>
> hadoop jar
> $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar -input
> hdfs://hosta/linesin/ -output hdfs://hosta/linesout -mapper
> /home/hadoop/toRecords.pl -reducer /bin/cat
>
> The NodeManager on Host B notes that there was an error starting the
> container:
>
> 13/12/14 19:38:45 WARN nodemanager.DefaultContainerExecutor: Exception
> from container-launch with container ID:
> container_1387067177654_0002_01_000001 and exit code: 1
>
> org.apache.hadoop.util.Shell$ExitCodeException:
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>         at org.apache.hadoop.util.Shell.run(Shell.java:379)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>         at java.util.concurrent.FutureTask.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>
> On Host B, I open
> userlogs/application_1387067177654_0002/container_1387067177654_0002_01_000001/syslog
> and find the following messages (note the DEBUG-level messages which I
> manually enabled for the DFS client):
>
> 2013-12-14 19:38:32,439 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: Using local interfaces [1.2.3.4] with addresses [/1.2.3.4:0]
>
> <cut>
>
> 2013-12-14 19:38:33,085 DEBUG [main] org.apache.hadoop.hdfs.DFSClient:
> newInfo = LocatedBlocks{
>
>   fileLength=537
>
>   underConstruction=false
>
> blocks=[LocatedBlock{BP-1911846690-1.2.3.4-1386999495143:blk_1073742317_1493;
> getBlockSize()=537; corrupt=false; offset=0; locs=[5.6.7.8:50010,