Re: What is the reason that DataNode should be configured with the http and streamer ports under 1024 in secure mode?
Aaron T. Myers 2012-11-16, 19:41
You're right that if a malicious user can get a fake DN to register with
the NN then the low ports don't matter. However, in order for a user to get
a malicious DN to register with the NN that user will need access to the DN
keytab. If the user has access to that, then all is lost anyway since by
logging in with that keytab the user can act as an HDFS super user.
The purpose of requiring the low ports is to prevent the following scenario:
1) A malicious user finds a crashed DN process or somehow causes a DN process
to crash.
2) Before the NN considers that DN dead (by default 10 minutes) the
malicious user starts a fake DN process on the same (high) ports.
3) For those 10 minutes, the NN continues to tell clients that it's OK to
write to the DN that has just crashed.
4) The malicious user steals all the data that clients write to the crashed
DN's ports in those 10 minutes.
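Step 2 only works if the fake process can bind the crashed DN's ports. As a minimal sketch (not the actual Hadoop code), the check below classifies a port as privileged under the usual Unix convention that only root may bind ports below 1024. The class and method names are hypothetical; 1004 and 2004 are the example values from the original question.

```java
public class PrivilegedPortCheck {
    // Assumption: the conventional Unix boundary; ports below it need root to bind.
    static final int FIRST_UNPRIVILEGED_PORT = 1024;

    // Hypothetical helper: true if binding this port normally requires root.
    static boolean isPrivilegedPort(int port) {
        return port > 0 && port < FIRST_UNPRIVILEGED_PORT;
    }

    // Hypothetical helper: true if an ordinary (non-root) user could start a
    // process listening on this port after the real DN has crashed.
    static boolean ordinaryUserCouldRebind(int port) {
        return !isPrivilegedPort(port);
    }

    public static void main(String[] args) {
        int[] ports = {1004, 2004};
        for (int port : ports) {
            String verdict = ordinaryUserCouldRebind(port)
                    ? "any local user could take over this port"
                    : "only root can bind this port";
            System.out.println(port + ": " + verdict);
        }
    }
}
```

With the DN on a port such as 1004, the non-root attacker in step 2 cannot start a listener on it, so clients that the NN still directs to the crashed DN will fail to connect instead of sending their data to the fake process.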
I hope this clears things up.
Aaron T. Myers
Software Engineer, Cloudera
On Fri, Nov 16, 2012 at 6:44 AM, Xiaohan <[EMAIL PROTECTED]> wrote:
> Hi, guys.
> Our cluster is now moving to secure mode. We have found many differences from
> non-secure mode; one is how the datanode is started. I am not sure how it
> works, so I am sending this email to ask.
> Secure mode must use jsvc-like tools to start the datanode, because they allow
> the datanode to listen on ports under 1024 while running as a non-root user.
> I searched Google for the reason for using ports under 1024 and only found
> that Cloudera's CDH doc describes the reason, which is "DataNode must be
> below 1024, because this provides part of the security mechanism to make it
> impossible for a user to run a map task which impersonates a DataNode."
> I tried configuring the datanode's http port as 2004 (the suggested value
> is 1004), which is above 1024, and then started it in secure mode. As
> expected, the datanode failed to start. But I found that the failure occurs
> because the DataNode itself checks the port number and throws an exception.
> Since a user running a map task may impersonate the DataNode, he could also
> change the DataNode code to skip that check. If the user does so, he can
> still impersonate the DataNode on a port above 1024, which a non-root user,
> and therefore an application in a map task, could use.
> So I supposed that the NN should also do the check. I deleted the check
> code in the DataNode, configured the http port as 2004, and then started the
> DataNode in secure mode. The DataNode started successfully and the NN
> accepted it.
> Data was also written to the DataNode. Everything worked as if the
> DataNode were a normal one.
> Is it a defect? Or have I missed something? Either way, please let
> me know. Thank you.