Re: Is there a reason for the Hive remote metastore to execute commands as different users?
Ashutosh Chauhan 2011-11-30, 07:02
This is indeed a bug. I have posted a patch for it at
https://issues.apache.org/jira/browse/HIVE-2616 . Would you like to try it
out and see whether it works for you?
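
For reference, the permission arithmetic behind points 1 and 2 of the
original message quoted below can be sketched as follows. This is only a
simplified POSIX-style model in Python, not Hive or HDFS code. The user
and group names (hiveclient, hivemeta, users, hive) are hypothetical, and
the sketch assumes that removing or adding entries in a directory needs
write and execute permission on that directory.

```python
def mode_after_umask(requested: int, umask: int) -> int:
    """Return the permission bits left after the umask clears its bits."""
    return requested & ~umask & 0o777


def can_modify_dir(mode, owner, group, user, user_groups, superuser=False):
    """Return True if `user` may add or remove entries in a directory.

    Simplified model (an assumption, not the HDFS implementation): the
    user needs both write and execute, taken from the owner, group, or
    other bits, whichever class applies first. A superuser always passes.
    """
    if superuser:
        return True
    if user == owner:
        bits = (mode >> 6) & 0o7
    elif group in user_groups:
        bits = (mode >> 3) & 0o7
    else:
        bits = mode & 0o7
    return (bits & 0o3) == 0o3


# Point 1: directory created by the client during LOAD with umask 022.
load_dir_mode = mode_after_umask(0o777, 0o022)

# Hypothetical identities: the client owns the directory, while the
# metastore runs as a different non-superuser in a different group.
client_ok = can_modify_dir(load_dir_mode, "hiveclient", "users",
                           "hiveclient", {"users"})
metastore_ok = can_modify_dir(load_dir_mode, "hiveclient", "users",
                              "hivemeta", {"hive"})

# Point 2: the mirror case, where the metastore user owns the table
# directory created by CREATE and the client tries to write into it.
client_on_table_dir = can_modify_dir(load_dir_mode, "hivemeta", "hive",
                                     "hiveclient", {"users"})

print(oct(load_dir_mode), client_ok, metastore_ok, client_on_table_dir)
```

Under this model the owner can modify the directory, but a different
non-superuser cannot, in either direction, which matches the behavior
reported in the quoted message.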
On Tue, Nov 29, 2011 at 02:45, Alex Holmes <[EMAIL PROTECTED]> wrote:
> Running MySQL as the metastore doesn't change the behavior of the HDFS
> operations, and more importantly who (the ugi) they are executed as.
> Does anyone have any thoughts as to why Hive HDFS operations are run
> as different users?
> Many thanks,
> On Tue, Nov 29, 2011 at 2:47 AM, Alexander C.H. Lorenz
> <[EMAIL PROTECTED]> wrote:
> > Derby depends on a local filestore. For more flexibility and security, I
> > suggest MySQL as the metastore.
> > - Alex
> > On Tue, Nov 29, 2011 at 3:06 AM, Alex Holmes <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >> I'm running Hive 0.7.1 with a remote metastore (Derby) on Hadoop 0.20.2.
> >> Is there a reason that CREATE and DROP commands when translated into
> >> HDFS operations are run as the remote Hive metastore user, but a LOAD
> >> is translated into HDFS operations that are executed as the Hive
> >> client user? If my understanding is correct, doesn't this mean that:
> >> 1. The Hive remote metastore must always be run as a superuser, which
> >> is arguably a security risk. If I run the Hive remote metastore as a
> >> non-superuser different from the Hive client user, then a LOAD DATA
> >> LOCAL (with the HDFS umask default of 022) creates a directory chmod'd
> >> 755, which doesn't give the Hive metastore user permissions to remove
> >> the directory in a subsequent DROP.
> >> 2. The Hive client must have write permissions on the initial table
> >> directory created by the CREATE command executed as the Hive remote
> >> metastore user. This would only work in cases where both the remote
> >> Hive metastore user and the client Hive user were the same user, or if
> >> the Hive client were a superuser. In my own testing the only way I
> >> could get this to work when they were different users (and not
> >> superusers) was in the application of a locally written patch which
> >> addresses HIVE-2504.
> >> Maybe I'm over-simplifying, but couldn't all the Hive remote metastore
> >> HDFS operations be run as the Hive client's user/group?
> >> Thanks,
> >> Alex
> > --
> > Alexander Lorenz
> > http://mapredit.blogspot.com