MapReduce, mail # user - Hadoop Security - TaskTracker and Active Directory


bigbibguy father 2011-10-01, 02:19
Devaraj Das 2011-10-01, 15:14
bigbibguy father 2011-10-01, 16:36
Re: Hadoop Security - TaskTracker and Active Directory
Devaraj Das 2011-10-03, 20:44
Doing everything in Active Directory should work as well. What I said earlier was based more on the Yahoo deployment of security. Let us know how it goes.

On Oct 1, 2011, at 9:36 AM, bigbibguy father wrote:

> Thanks Devaraj for responding.
>
> In our case, the LDAP server is the corporate Active Directory server, which holds the user IDs and their attributes.
>
> Cluster nodes contact the KDC to get a TGT and service tickets for the NN and JT, and keep them until they expire (7 days). Cluster nodes contact the LDAP server for each task. So if I understand correctly, the LDAP traffic from the cluster nodes (around 1000) will be much higher than the authentication traffic from those nodes.
>
> Why not also use Active Directory as the KDC for authenticating the service principals (cluster nodes)?
>
> This way, we do not have to manage a separate KDC and worry about its availability and health.
>
> We also plan to have one Active Directory server in the same datacenter as the cluster, but outside the cluster firewall, so that LDAP queries have a higher SLA.
>
> The benefits associated with the local KDC option are listed below, with my analysis added after each one.
>
> * It requires less configuration with Active Directory. - But cluster nodes need to talk to Active Directory for the user details, so it needs configuration with Active Directory anyway.
> * It is comparatively easy to script the creation of many principals and keytabs. A principal and keytab must be created for every daemon in the cluster, and in a large cluster this can be extremely onerous to do directly in Active Directory. - This is a one-time job, and we may be able to script it with AD as well.
> * There is no need to involve central Active Directory administrators in order to get service principals created. - We get to manage the OU containing the service principals.
> * It allows for incremental configuration. The Hadoop administrator can completely configure and verify the functionality of the cluster independently of integrating with Active Directory. - This benefit is good to have, and it is not available in the Active Directory-only option.
> * It can serve to shield the corporate Active Directory server(s) from the many machines in a Hadoop cluster all requesting Kerberos tickets simultaneously. During cluster start-up, Hadoop will effectively be acting as a distributed denial of service attack on the central Active Directory server, which could adversely affect the performance of the Active Directory server. - The service principal authentication traffic is not that frequent, so these spikes should not be much of a problem for our highly available Active Directory.
>
>
> But the drawback of the local KDC option is that we need to maintain that KDC server and make sure it is highly available, with a backup server.
>
>
>
> Thanks and Regards,
> BBG
>
>
>
>
> On Sat, Oct 1, 2011 at 8:14 AM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> The cluster KDC should be set up to trust the Active Directory KDC (cross-realm trust, in Kerberos lingo). This handles user authentication when a user talks to a server in the cluster directly (e.g., user->namenode).
> The GID and other user attributes are usually stored in LDAP. The cluster nodes are set up to talk to the cluster-specific LDAP server.
>
> On Sep 30, 2011, at 7:19 PM, bigbibguy father wrote:
>
>> We are planning to enable secure Hadoop using Kerberos.
>>
>> Our users reside in Active Directory. We read that there are two options for using Kerberos to secure Hadoop.
>>
>> 1) Run Kerberos on a machine local to the cluster and create the service principals there
>> 2) Use Active Directory itself as the Kerberos KDC and create the service principals in Active Directory as well
>>
>> It seems Cloudera and the industry in general recommend option 1, running a local KDC for authenticating service principals.
>> https://ccp.cloudera.com/display/CDHDOC/Integrating+Hadoop+Security+with+Active+Directory
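
For readers following the point in the thread about scripting the creation of many principals and keytabs, here is a minimal sketch of what that could look like for the local (MIT Kerberos) KDC option. It only generates `kadmin` commands as text and does not run them. The host names, service names, realm, and keytab directory are all hypothetical placeholders, not values from this thread. The Active Directory equivalent would use different tooling and is not shown.

```python
from typing import Iterable, List


def kadmin_commands(
    hosts: Iterable[str],
    services: Iterable[str],
    realm: str,
    keytab_dir: str,
) -> List[str]:
    """Build kadmin commands that create one principal and one keytab
    per (service, host) pair. Nothing is executed here."""
    services = list(services)
    commands: List[str] = []
    for host in hosts:
        for service in services:
            principal = f"{service}/{host}@{realm}"
            keytab = f"{keytab_dir}/{service}.{host}.keytab"
            # Create the principal with a random key (no password).
            commands.append(f"addprinc -randkey {principal}")
            # Export that principal's key into its own keytab file.
            commands.append(f"xst -k {keytab} {principal}")
    return commands


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    example_hosts = ["node1.example.com", "node2.example.com"]
    example_services = ["hdfs", "mapred"]
    for line in kadmin_commands(
        example_hosts, example_services, "CLUSTER.EXAMPLE.COM", "/tmp/keytabs"
    ):
        print(line)
```

The printed lines could be reviewed and then fed to `kadmin.local` on the KDC host. Keeping generation separate from execution makes it easier to check the principal list for a large cluster before anything is created.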