What is the reason that the DataNode must be configured with HTTP and streamer ports below 1024 in secure mode?
Our cluster is now moving to secure mode. We have found many differences from non-secure mode, one of which is how the DataNode is started. I am not sure how this works, so I am writing to the list to ask.
In secure mode, the DataNode must be started with a jsvc-like tool, because that allows the DataNode to listen on ports below 1024 while running as a non-root user.
I searched Google for the reason for using ports below 1024 and found only Cloudera's CDH documentation, which gives this explanation: "DataNode must be below 1024, because this provides part of the security mechanism to make it impossible for a user to run a map task which impersonates a DataNode."
I tried configuring the DataNode's HTTP port as 2004 (the suggested value is 1004), which is above 1024, and then started it in secure mode. As expected, it failed to start. But I found that the failure occurs because the DataNode itself checks the port number and throws an exception. Since a user who runs a map task may try to impersonate a DataNode, he could also modify the DataNode code to skip that check. In that case he would still be impersonating a DataNode, on a port above 1024, which a non-root user (and therefore an application in a map task) can bind.
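To make the point concrete, here is a minimal sketch of the kind of self-check I mean. It is not the actual Hadoop source: the class name, method names, exception type, and message are all hypothetical, and the streamer port value 1005 is just an arbitrary example below 1024.

```java
// Hypothetical sketch of a DataNode-side port check; NOT the real Hadoop code.
final class PrivilegedPortCheck {
    // On Unix-like systems, ports below 1024 normally require root to bind.
    static final int FIRST_UNPRIVILEGED_PORT = 1024;

    static boolean isPrivileged(int port) {
        return port > 0 && port < FIRST_UNPRIVILEGED_PORT;
    }

    // Refuses to continue if security is enabled and either port is unprivileged.
    static void checkSecurePorts(boolean securityEnabled, int streamerPort, int httpPort) {
        if (!securityEnabled) {
            return; // no restriction in non-secure mode
        }
        if (!isPrivileged(streamerPort) || !isPrivileged(httpPort)) {
            throw new IllegalStateException(
                "Cannot start secure DataNode with unprivileged ports: streamer="
                + streamerPort + ", http=" + httpPort);
        }
    }

    public static void main(String[] args) {
        checkSecurePorts(true, 1005, 1004); // both below 1024: passes silently
        try {
            checkSecurePorts(true, 1005, 2004); // HTTP port above 1024: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this runs inside the process being started, so anyone who can rebuild that process can simply remove the call. That is why I would not expect it to be the real security boundary on its own.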
So I assumed that the NN would also perform the check. I deleted the check in the DataNode code, configured the HTTP port as 2004, and started the DataNode in secure mode. The DataNode started successfully and the NN accepted it.
Data was also written to the DataNode. Everything worked as if it were a normal DataNode.
Is this a defect, or have I missed something? In either case, please let me know. Thank you.
Aaron T. Myers 2012-11-16, 19:41
Xiaohan 2012-11-19, 04:28
Todd Lipcon 2012-11-19, 18:01