Hadoop >> mail # user >> hadoop security API (repost)


Re: hadoop security API (repost)
Tony,

If you are writing a server app that interacts with the cluster on
behalf of different users (like Oozie, as you mentioned in your
email), then you should use the proxyuser capabilities of Hadoop.

* Configure user MYSERVERUSER as a proxyuser in Hadoop's core-site.xml
(this requires two properties, hadoop.proxyuser.MYSERVERUSER.hosts and
hadoop.proxyuser.MYSERVERUSER.groups).
* Run your server app as MYSERVERUSER and give it a Kerberos principal
MYSERVERUSER/MYSERVERHOST.
* Initialize your server app by logging in from the
MYSERVERUSER/MYSERVERHOST keytab.
* Use UGI.doAs() to create JobClient/FileSystem instances as the user
on whose behalf you want to act.
* Keep in mind that every user you act on behalf of must be a valid
Unix user in the cluster.
* If those users need direct access to the cluster, they must also be
defined in the KDC user database.
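
The steps above could look roughly like the Java sketch below. This is
only an illustration under assumptions, not a tested recipe: the class
name ProxyUserSketch, the helper names, the EXAMPLE.COM realm, the
keytab path and the end user "alice" are all hypothetical. It also
assumes core-site.xml on the classpath already enables Kerberos and
sets hadoop.proxyuser.MYSERVERUSER.hosts and
hadoop.proxyuser.MYSERVERUSER.groups.

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserSketch {

    // Build a UGI for endUser whose "real" user is the logged-in server user.
    public static UserGroupInformation proxyFor(String endUser) throws IOException {
        UserGroupInformation realUser = UserGroupInformation.getLoginUser();
        return UserGroupInformation.createProxyUser(endUser, realUser);
    }

    // Create a FileSystem inside doAs() so it is bound to endUser's identity.
    public static FileSystem fileSystemAs(String endUser, final Configuration conf)
            throws IOException, InterruptedException {
        PrivilegedExceptionAction<FileSystem> action = () -> FileSystem.get(conf);
        return proxyFor(endUser).doAs(action);
    }

    public static void main(String[] args) throws Exception {
        // Loads core-site.xml (and friends) from the classpath.
        Configuration conf = new Configuration();
        UserGroupInformation.setConfiguration(conf);

        // Hypothetical principal and keytab path; replace with your own.
        UserGroupInformation.loginUserFromKeytab(
                "MYSERVERUSER/MYSERVERHOST@EXAMPLE.COM",
                "/etc/security/keytabs/myserveruser.keytab");

        // "alice" is a hypothetical end user; she needs no keytab of her own.
        FileSystem fs = fileSystemAs("alice", conf);
        System.out.println(fs.exists(new Path("/user/alice")));
    }
}
```

Each request thread can call fileSystemAs() with a different end user,
so the server holds a single Kerberos login while the cluster sees the
individual users.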

Hope this helps.

Thx

On Mon, Jul 2, 2012 at 6:22 AM, Tony Dean <[EMAIL PROTECTED]> wrote:
> Yes, but this will not work in a multi-tenant environment.  I need to be able to create a Kerberos TGT per execution thread.
>
> I was hoping through JAAS that I could inject the name of the current principal and authenticate against it.  I'm sure there is a best practice for hadoop/hbase client API authentication, just not sure what it is.
>
> Thank you for your comment.  The solution may well be associated with the UserGroupInformation class.  Hopefully, other ideas will come from this thread.
>
> Thanks.
>
> -Tony
>
> -----Original Message-----
> From: Ivan Frain [mailto:[EMAIL PROTECTED]]
> Sent: Monday, July 02, 2012 8:14 AM
> To: [EMAIL PROTECTED]
> Subject: Re: hadoop security API (repost)
>
> Hi Tony,
>
> I am currently working on this to access HDFS securely and programmatically.
> What I have found so far may help, even if I am not 100% sure this is the right way to proceed.
>
> If you have already obtained a TGT with the kinit command, the Hadoop library will locate it "automatically" as long as the ticket cache is in the default location. On Linux that is /tmp/krb5cc_uid-number, where uid-number is the numeric user ID.
>
> For example, as my Linux user hdfs, I obtained a TGT for the Hadoop user 'ivan',
> meaning the hdfs Linux user can act as ivan:
> ------------------------------------------
> hdfs@mitkdc:~$ klist
> Ticket cache: FILE:/tmp/krb5cc_10003
> Default principal: [EMAIL PROTECTED]
>
> Valid starting    Expires           Service principal
> 02/07/2012 13:59  02/07/2012 23:59  krbtgt/[EMAIL PROTECTED] renew until 03/07/2012 13:59
> -------------------------------------------
>
> Then you just have to set the right security options in your Hadoop Java client, and the identity will be [EMAIL PROTECTED] in our example. In my tests I only use HDFS; here is a snippet of code that accesses a secure HDFS cluster, assuming the previous TGT (ivan's impersonation):
>
> --------------------------------------------
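>      import java.net.URI
>      import org.apache.hadoop.fs.{CommonConfigurationKeysPublic, FileSystem}
>      import org.apache.hadoop.hdfs.{DFSConfigKeys, HdfsConfiguration}
>      import org.apache.hadoop.security.UserGroupInformation
>
>      // serverPrincipal and hdfsUri are assumed to be defined elsewhere
>      val conf: HdfsConfiguration = new HdfsConfiguration()
>      conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos")
>      conf.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true")
>      conf.set(DFSConfigKeys.DFS_NAMENODE_USER_NAME_KEY, serverPrincipal)
>
>      // Must be called before any login so UGI picks up the Kerberos settings
>      UserGroupInformation.setConfiguration(conf)
>
>      val fs = FileSystem.get(new URI(hdfsUri), conf)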
> --------------------------------------------
>
> This 'fs' is a handle for accessing HDFS securely as user 'ivan', even though ivan does not appear anywhere in the Hadoop client code.
>
> Anyway, I also see two other options:
>   * Setting the KRB5CCNAME environment variable to point to the right ticketCache file
>   * Specifying the keytab file you want to use from the UserGroupInformation singleton API:
> UserGroupInformation.loginUserFromKeytab(user, keytabFile)
>
> If you want to understand the auth process and the different login options, I suggest having a look at the UserGroupInformation.java source code (release 0.23.1 link: http://bit.ly/NVzBKL). The private class HadoopConfiguration at line 347 is of major interest in our case.

Alejandro