Re: Problem With NAT ips
Hi Mauro,

The registration process has changed quite a bit.  I don't think the NN "trusts" the DN's self-identification anymore.  Otherwise it makes it trivial to spoof another DN, intentionally or not, which can be a security hazard.

I suspect the NN can't resolve the DN. Unresolvable hosts are rejected because the allow/deny lists may contain hostnames: if DNS is temporarily unavailable, you don't want a node blocked by hostname to slip through. Try adding the DN's public IP, 10.70.5.57, to the NN's /etc/hosts if it's not resolvable via DNS.
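
To illustrate the reasoning above, here is a minimal Python sketch of that kind of admission check. It is not Hadoop's actual code (the real logic lives in the NameNode, in Java): `reverse_lookup`, `admit_datanode`, `include`, `exclude` and `resolve` are hypothetical names chosen for the example. The point it tries to show is that when the lists may hold hostnames, an address that cannot be resolved has to be refused, otherwise a host excluded by name could register during a DNS outage.

```python
import socket


def reverse_lookup(ip):
    """Return the hostname for ip, or None if it cannot be resolved."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None


def admit_datanode(ip, include, exclude, resolve=reverse_lookup):
    """Hypothetical check; include/exclude are sets of hostnames or IPs."""
    host = resolve(ip)
    if host is None:
        # Without a name we cannot tell whether a hostname entry applies
        return False
    names = {ip, host}
    if names & exclude:
        return False
    # An empty include list is treated as "allow everyone not excluded"
    return not include or bool(names & include)
```

Under these assumptions, an entry such as `10.70.5.57 hadoop-2-01` in the NN's /etc/hosts would typically let the system resolver answer the lookup, so the sketch's check would no longer fail at the resolution step.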

I hope this helps!

Daryn

On Apr 10, 2013, at 4:32 PM, Mauro Cohen wrote:

Hello, I have a problem with the new version of Hadoop.

I have a cluster with 2 nodes.
Each one has a private IP and a public IP configured through NAT.
The problem is that the private IPs of the nodes don't belong to the same network. (I have no connectivity between the nodes through those IPs.)
I have connectivity between the nodes through the NAT IPs only (ssh, ping, etc.).

With Hadoop 0.20.x, when I set up the datanode and namenode configuration files I always used the hostname for properties (e.g. the fs.default.name property) and never had problems with this.
But in the new version of Hadoop, the way the nodes communicate with each other seems to have changed, and at some point they use the private IP instead of hostnames.

I have installed a cluster with 2 nodes:

hadoop-2-00 is the namenode.
On hadoop-2-00 I have this /etc/hosts file and this ifconfig output:

/etc/hosts:

172.16.67.68 hadoop-2-00

ifconfig:

eth0      Link encap:Ethernet  HWaddr fa:16:3e:4c:06:25
          inet addr:172.16.67.68  Bcast:172.16.95.255  Mask:255.255.224.0
          inet6 addr: fe80::f816:3eff:fe4c:625/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:73475 errors:0 dropped:0 overruns:0 frame:0
          TX packets:58912 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:100923399 (100.9 MB)  TX bytes:101169918 (101.1 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:588 (588.0 B)  TX bytes:588 (588.0 B)

The NAT IP for this node is 10.70.5.51.

I use the hostname (hadoop-2-00) in all the Hadoop configuration files.

The other node is the datanode hadoop-2-01, which has this ifconfig output and /etc/hosts file:

eth0      Link encap:Ethernet  HWaddr fa:16:3e:70:5e:bd
          inet addr:172.16.67.69  Bcast:172.16.95.255  Mask:255.255.224.0
          inet6 addr: fe80::f816:3eff:fe70:5ebd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27081 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24105 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:95842550 (95.8 MB)  TX bytes:4314694 (4.3 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:34 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1900 (1.9 KB)  TX bytes:1900 (1.9 KB)

/etc/hosts

172.16.67.69 hadoop-2-01

The NAT IP for that host is 10.70.5.57.

When I start the namenode there is no problem.

But when I start the datanode there is an error.

This is the stack trace from the datanode log:

2013-04-10 16:01:26,997 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/10.70.5.51:8020 beginning handshake with NN
2013-04-10 16:01:27,013 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/10.70.5.51:8020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(0.0.0.0, storageID=DS-1556234100-172.16.67.69-50010-1365621786288, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=CID-65f42cc4-6c02-4537-9fb8-627a612ec74e;nsid=1995699852;c=0)
        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:629)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3459)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:881)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:90)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:18295)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)

        at org.apache.hadoop.ipc.Client.