Hadoop >> mail # user >> hadoop security API (repost)

Tony Dean 2012-07-01, 17:46
Re: hadoop security API (repost)
Hi Tony,

I am currently working on this, to access HDFS securely and programmatically.
What I have found so far may help, even though I am not 100% sure this is the
right way to proceed.

If you have already obtained a TGT with the kinit command, the Hadoop library
will locate it "automatically" as long as the ticket cache is in the default
location. On Linux that is /tmp/krb5cc_<uid>, where <uid> is the numeric user id.

For example, logged in as my Linux user hdfs, I obtain a TGT for the Hadoop user
'ivan', meaning the hdfs Linux user can impersonate ivan:
hdfs@mitkdc:~$ klist
Ticket cache: FILE:/tmp/krb5cc_10003
Default principal: [EMAIL PROTECTED]

Valid starting    Expires           Service principal
02/07/2012 13:59  02/07/2012 23:59  krbtgt/[EMAIL PROTECTED]
renew until 03/07/2012 13:59
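
To make the lookup rule concrete, here is a minimal sketch of where to expect the ticket cache for a given uid. It only mirrors the rule described in this message (default /tmp/krb5cc_<uid>, overridden by the KRB5CCNAME environment variable); it is not Hadoop's own code, and the object name TicketCache, the method expectedPath, and the stripping of a "FILE:" prefix are assumptions made for illustration.

```scala
// Sketch only: mirrors the lookup rule described above, not Hadoop's actual code.
object TicketCache {
  /** Returns the ticket cache file to expect for a given Linux uid.
    * `env` is a parameter (instead of reading sys.env directly) so the rule is easy to check. */
  def expectedPath(uid: Long, env: Map[String, String] = sys.env): String =
    env.get("KRB5CCNAME") match {
      // Explicit override; a leading "FILE:" type prefix is dropped (assumption of this sketch)
      case Some(name) if name.nonEmpty => name.stripPrefix("FILE:")
      // Default location on Linux
      case _ => s"/tmp/krb5cc_$uid"
    }
}
```

With no KRB5CCNAME set, TicketCache.expectedPath(10003, Map.empty) gives /tmp/krb5cc_10003, which is the file klist reports above.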

Then you just have to set the right security options in your Java Hadoop
client, and the identity will be [EMAIL PROTECTED] in our example. In my
tests I only use HDFS; here is a snippet of code that accesses a
secure HDFS cluster, assuming the previous TGT (ivan's impersonation):

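     import java.net.URI
     import org.apache.hadoop.fs.FileSystem
     import org.apache.hadoop.hdfs.{DFSConfigKeys, HdfsConfiguration}

     // serverPrincipal (the NameNode's Kerberos principal) and hdfsUri are defined elsewhere
     val conf: HdfsConfiguration = new HdfsConfiguration()
     conf.set(DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY, serverPrincipal)

     val fs = FileSystem.get(new URI(hdfsUri), conf)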

This 'fs' is a handle for accessing HDFS securely as user 'ivan', even though
ivan does not appear anywhere in the Hadoop client code.
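
A quick way to confirm which identity is really in use is to ask Hadoop who it thinks the current user is, then list a directory through the handle. This is only a sketch: the object name WhoAmI and the choice of listing "/" are illustrative assumptions, and it needs a reachable secure cluster and a valid ticket to print anything useful.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object WhoAmI {
  /** Prints the identity Hadoop resolved, then the owner and path of each entry under "/". */
  def check(fs: FileSystem): Unit = {
    println(s"Hadoop sees this client as: ${UserGroupInformation.getCurrentUser.getUserName}")
    fs.listStatus(new Path("/")).foreach { status =>
      println(s"${status.getOwner}\t${status.getPath}")
    }
  }
}
```

Calling WhoAmI.check(fs) with an FileSystem obtained as above should report the principal from the ticket cache rather than the Linux user name.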

Anyway, I also see two other options:
  * Setting the KRB5CCNAME environment variable to point to the right
ticket cache file
  * Specifying the keytab file you want to use through the static
UserGroupInformation API:
UserGroupInformation.loginUserFromKeytab(user, keytabFile)
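
For the keytab option, a minimal sketch could look like the following. The object name KeytabLogin is hypothetical, and I am assuming Kerberos has to be switched on in the configuration handed to UserGroupInformation before the login call; on a cluster where core-site.xml already sets hadoop.security.authentication, the conf.set line may be redundant.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

object KeytabLogin {
  /** Logs the whole process in from a keytab; no kinit or ticket cache is involved. */
  def login(principal: String, keytabPath: String): UserGroupInformation = {
    val conf = new Configuration()
    // Tell the security layer to use Kerberos instead of the default "simple" mode
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)
    // Throws IOException if the principal/keytab pair cannot authenticate
    UserGroupInformation.loginUserFromKeytab(principal, keytabPath)
    UserGroupInformation.getLoginUser
  }
}
```

A call would be something like KeytabLogin.login("app/host@EXAMPLE.COM", "/path/to/app.keytab"), where both values are placeholders for your own principal and keytab file.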

If you want to understand the auth process and the different login options,
I guess you need to have a look at the UserGroupInformation.java
source code (release 0.23.1 link: http://bit.ly/NVzBKL). The private class
HadoopConfiguration, at line 347, is of major interest in our case.

Another point is that I did not find any easy way to prompt the user for a
password at runtime using the current Hadoop API. The login options appear to
be more or less hardcoded in UserGroupInformation. I guess it would be nice
to have a new function that hands UserGroupInformation an already authenticated
'Subject', overriding all default configuration. If someone has
better ideas, it would be nice to discuss them as well.
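
For reference, obtaining an authenticated 'Subject' from a password with plain JAAS, outside of Hadoop, can be sketched as below. It uses only JDK classes; the names PromptHandler, PasswordLogin and "prompt-login" are hypothetical. It stops at the JAAS login: passing the resulting Subject to UserGroupInformation is exactly the piece I did not find in the Hadoop API.

```scala
import javax.security.auth.Subject
import javax.security.auth.callback.{Callback, CallbackHandler, NameCallback,
  PasswordCallback, UnsupportedCallbackException}
import javax.security.auth.login.{AppConfigurationEntry, Configuration, LoginContext}

/** Answers JAAS name/password callbacks; readPassword is invoked only when JAAS asks for it. */
class PromptHandler(principal: String, readPassword: () => Array[Char]) extends CallbackHandler {
  override def handle(callbacks: Array[Callback]): Unit = callbacks.foreach {
    case n: NameCallback     => n.setName(principal)
    case p: PasswordCallback => p.setPassword(readPassword())
    case other               => throw new UnsupportedCallbackException(other)
  }
}

object PasswordLogin {
  /** In-memory JAAS configuration with a single Krb5LoginModule entry (no jaas.conf file). */
  private val krb5Only: Configuration = new Configuration {
    override def getAppConfigurationEntry(name: String): Array[AppConfigurationEntry] = Array(
      new AppConfigurationEntry(
        "com.sun.security.auth.module.Krb5LoginModule",
        AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
        java.util.Collections.emptyMap[String, AnyRef]()))
  }

  /** Contacts the KDC and returns a Subject holding the user's Kerberos credentials. */
  def login(principal: String, readPassword: () => Array[Char]): Subject = {
    val ctx = new LoginContext("prompt-login", null, new PromptHandler(principal, readPassword), krb5Only)
    ctx.login() // throws LoginException on a wrong password or an unreachable KDC
    ctx.getSubject
  }
}
```

In an interactive program the password supplier could be () => System.console().readPassword("Password: "), keeping in mind that System.console() returns null when no terminal is attached.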

2012/7/1 Tony Dean <[EMAIL PROTECTED]>

> Hi,
> The security documentation specifies how to test a secure cluster by using
> kinit and thus adding the Kerberos principal TGT to the ticket cache in which
> the hadoop client code uses to acquire service tickets for use in the cluster.
> What if I created an application that used the hadoop API to communicate with
> hdfs and/or mapred protocols, is there a programmatic way to inform hadoop to
> use a particular Kerberos principal name with a keytab that contains its
> password key?  I didn't see a way to integrate with JAAS KrbLoginModule.
> I was thinking that if I could inject a callbackHandler, I could pass the
> principal name and the KrbLoginModule already has options to specify keytab.
> Is this something that is possible?  Or is this just not the right way to
> do things?
> I read about impersonation where authentication is performed with a system
> user such as "oozie" and then it just impersonates other users so that
> permissions are based on the impersonated user instead of the system user.
> Please help me understand my options for executing hadoop tasks in a
> multi-tenant application.
> Thank you!

Ivan Frain
11, route de Grenade
31530 Saint-Paul-sur-Save
mobile: +33 (0)6 52 52 47 07