
John Omernik 2013-05-04, 13:31
John Omernik 2013-05-16, 20:42
Re: Hive Authorization and Views
The largest issue is that the RDBMS security model does not match Hive's.
Hive/Hadoop has file permissions; RDBMSs have column-level and sometimes
row-level permissions.

When you physically have access to the underlying file, row-level
permissions are not enforceable. The only way to enforce this type of
security is to force users through a "turnstile", which changes how Hive
currently works.
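
To make the "turnstile" point concrete, here is a minimal Python sketch. Nothing in it is Hive code: RAW_FILE, ALLOWED_COLUMNS, read_through_gateway and read_file_directly are all hypothetical names made up for illustration. It only shows that a column check holds when every read goes through the gateway, and that reading the file directly skips it.

```python
import csv
import io

# Hypothetical file contents, standing in for a table's data file.
RAW_FILE = "id,name,ssn\n1,alice,123-45-6789\n2,bob,987-65-4321\n"

# Hypothetical per-user column grants (an assumption, not a Hive feature).
ALLOWED_COLUMNS = {"analyst": {"id", "name"}}


def read_through_gateway(user, columns):
    """Return only the requested columns, refusing any the user lacks."""
    allowed = ALLOWED_COLUMNS.get(user, set())
    denied = [c for c in columns if c not in allowed]
    if denied:
        raise PermissionError(f"{user} may not read: {denied}")
    reader = csv.DictReader(io.StringIO(RAW_FILE))
    return [{c: row[c] for c in columns} for row in reader]


def read_file_directly():
    """Anyone with read access to the file itself sees every column."""
    return list(csv.DictReader(io.StringIO(RAW_FILE)))
```

Calling read_through_gateway("analyst", ["ssn"]) raises PermissionError, while read_file_directly() returns the ssn column regardless of the grants.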
On Thu, May 16, 2013 at 4:42 PM, John Omernik <[EMAIL PROTECTED]> wrote:

> I am curious about the thoughts of the community here; this seems like
> something many enterprises would drool over with Hive... I am not a coder,
> so the level of coding involved in something like this is unknown to me.
> On Sat, May 4, 2013 at 8:31 AM, John Omernik <[EMAIL PROTECTED]> wrote:
>> We were doing some tests this past week with Hive authorization. One of
>> our current use "challenges" is when we have an underlying, well-managed
>> and partitioned table, and we want to allow access to certain columns in
>> that table. Our first thoughts went to VIEWs, as that's a common use case
>> with relational databases (i.e. set up a view with only the columns you
>> want the user to access and set the permissions appropriately).
>> In testing, and this is not surprising given the "newness" of Hive
>> authorization, a VIEW cannot be created so as to allow access to a table
>> without granting access to the underlying table, defeating the idea of the
>> view as a tool to manage that access.
>> So I wanted to put this to the user group: I've done some JIRA searching and
>> didn't find anything (I will admit my JIRA search-fu is not stellar), but
>> is there an option that could be thrown together in Hive that would allow
>> that use case? Perhaps a configuration setting that would allow views to
>> execute as a specific user (perhaps a global user, or perhaps a user
>> specified at view creation). This could allow the "view" to have access to
>> the underlying table, and since the view, once created, couldn't be changed
>> by the user, you could grant "read" permission on the view to the user or
>> group of users you want to have access.
>> I suppose this has challenges, e.g. can a user just create a view to
>> bypass table-level restrictions? Perhaps if this model were taken, a
>> privilege for CREATING/MODIFYING views could be created and granted only to
>> a superuser of some sort. I am really just walking through ideas here, as
>> this is one of the last stumbling blocks we have with Hive from an
>> "Enterprise ready" point of view. Heck, if done right, you could almost do
>> data masking at the view level. You have a column in your source data that
>> is sensitive, so instead of returning that column you do an MD5 (can we have
>> a native MD5 function? :) of that column or you blank that column. If we put
>> in strong security on the creation and modification of views, and allow
>> views to execute as a different user that has access to the source data, you
>> have a powerful way to represent your data to all levels within your org.
>> Also: since I am just brainstorming here, I'd love to hear what others
>> may be doing around this area. Perhaps the Hive user community can come up
>> with a strategic plan, while at the same time sharing some shorter-term
>> workarounds.
>> Thanks!
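
For illustration only, the view idea described above (expose a subset of columns and hash the sensitive one) can be sketched with Python's built-in sqlite3 module. This is SQLite, not Hive, and everything here is an assumption: the customers table, the customers_masked view, the sample rows, and the md5 SQL function, which is registered by hand because SQLite has no native one. SQLite also has no users or grants, so this shows only the shape of the view, not the "execute as a different user" part.

```python
import hashlib
import sqlite3


def md5_hex(value):
    """Return the hex MD5 digest of a value, or None for NULL."""
    if value is None:
        return None
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# Register a user-defined md5() so the view can call it (hypothetical helper).
conn.create_function("md5", 1, md5_hex)

# Hypothetical source table with one sensitive column (ssn).
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, ssn TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "alice", "123-45-6789", "east"), (2, "bob", "987-65-4321", "west")],
)

# The view omits name entirely and exposes only a hash of ssn.
conn.execute(
    """
    CREATE VIEW customers_masked AS
    SELECT id, region, md5(ssn) AS ssn_hash
    FROM customers
    """
)

rows = conn.execute("SELECT * FROM customers_masked ORDER BY id").fetchall()
for row in rows:
    print(row)
```

A reader of customers_masked never sees name or the raw ssn. Whether that is real protection still depends on the reader having no access to the underlying table, which is the missing piece discussed in this thread.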
Sanjay Subramanian 2013-05-16, 21:19
John Omernik 2013-05-16, 21:49