

Re: Is there a reason for the Hive remote metastore to execute commands as different users?
Hey Alex,

This is indeed a bug. I have posted a patch for it at
https://issues.apache.org/jira/browse/HIVE-2616. Would you like to try it
out and see whether it works for you?

Ashutosh
On Tue, Nov 29, 2011 at 02:45, Alex Holmes <[EMAIL PROTECTED]> wrote:

> Running mysql as the metastore doesn't change the behavior of the HDFS
> operations, and more importantly who (the ugi) they are executed as.
>
> Does anyone have any thoughts as to why Hive HDFS operations are run
> as different users?
>
> Many thanks,
> Alex
>
>
> On Tue, Nov 29, 2011 at 2:47 AM, Alexander C.H. Lorenz
> <[EMAIL PROTECTED]> wrote:
> > Derby depends on a local filestore, for more flexibility and security I
> > suggest mysql as a metastore.
> > - Alex
> >
> > On Tue, Nov 29, 2011 at 3:06 AM, Alex Holmes <[EMAIL PROTECTED]> wrote:
> >>
> >> Hi,
> >>
> >> I'm running Hive 0.7.1 with a remote metastore (Derby) on Hadoop 0.20.2.
> >>
> >> Is there a reason that CREATE and DROP commands when translated into
> >> HDFS operations are run as the remote Hive metastore user, but a LOAD
> >> is translated into HDFS operations that are executed as the Hive
> >> client user?  If my understanding is correct, doesn't this mean that:
> >>
> >> 1.  The Hive remote metastore must always be run as a superuser, which
> >> is arguably a security risk.  If I run the Hive remote metastore as a
> >> non-superuser different from the Hive client user, then a LOAD DATA
> >> LOCAL (with the HDFS umask default of 022) creates a directory chmod'd
> >> 755, which doesn't give the Hive metastore user permissions to remove
> >> the directory in a subsequent DROP.
> >>
> >> 2.  The Hive client must have write permissions on the initial table
> >> directory created by the CREATE command executed as the Hive remote
> >> metastore user.  This would only work in cases where both the remote
> >> Hive metastore user and the client Hive user were the same user, or if
> >> the Hive client were a superuser.  In my own testing the only way I
> >> could get this to work when they were different users (and not
> >> superusers) was in the application of a locally written patch which
> >> addresses HIVE-2504.
> >>
> >> Maybe I'm over-simplifying, but couldn't all the Hive remote metastore
> >> HDFS operations be run as the Hive client's user/group?
> >>
> >> Thanks,
> >> Alex
> >
> >
> >
> > --
> > Alexander Lorenz
> > http://mapredit.blogspot.com
> > Think of the environment: please don't print this email unless you
> > really need to.
> >
> >
>
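
To illustrate the permission problem described in the original question, here is a minimal Python sketch of POSIX-style permission bits. It is a simplified model under stated assumptions, not HDFS's actual permission checker: the helper functions `apply_umask` and `has_write`, and the user and group names `hive_client` and `hive_metastore`, are hypothetical and chosen only for this example.

```python
import stat


def apply_umask(requested_mode: int, umask: int) -> int:
    """Return the mode left after clearing the bits set in the umask."""
    return requested_mode & ~umask


def has_write(mode, owner, group, user, user_groups, superuser=False):
    """Simplified POSIX-style write check (assumption: no ACLs, no sticky bit)."""
    if superuser:
        return True  # a superuser bypasses permission checks
    if user == owner:
        return bool(mode & stat.S_IWUSR)
    if group in user_groups:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Directory created by the client during LOAD DATA LOCAL with umask 022.
dir_mode = apply_umask(0o777, 0o022)
print(oct(dir_mode))

# The owning client user can write to the directory.
client_ok = has_write(dir_mode, "hive_client", "hive_client",
                      "hive_client", {"hive_client"})

# A different, non-superuser metastore user cannot, so a later DROP fails.
metastore_ok = has_write(dir_mode, "hive_client", "hive_client",
                         "hive_metastore", {"hive_metastore"})

# Running the metastore as a superuser sidesteps the check entirely.
metastore_super_ok = has_write(dir_mode, "hive_client", "hive_client",
                               "hive_metastore", {"hive_metastore"},
                               superuser=True)

print(client_ok, metastore_ok, metastore_super_ok)
```

In this model the directory ends up with mode 755, so only its owner has the write bit. That matches the behavior reported in the thread: the DROP works only when the metastore runs as the same user as the client or as a superuser.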