Re: Re: in security mode, one MR job visit two user's data
wang 2013-02-12, 03:17
I am very happy to get the response. Thank you.
Giving read permission to the group is not OK either, because then other users
can easily read the data with a DFSClient. So I think a user's data
in HDFS should be mode 700 only.
As for what you said about ACLs in HDFS: my understanding is that the
permission model in HDFS is too simple. A file has only one owner, so if I set
the permission to 700, how can I give another user read permission?
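To make the trade-off concrete, here is a minimal sketch of the owner/group/other check that POSIX-style mode bits imply. It does not call HDFS; the class name, the user names and the group name "analysts" are all made up for illustration, and it only assumes HDFS tests the owner class first, then the group class, then others.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical illustration only: models a mode-bit read check, it is not HDFS code.
class ModeBitsSketch {
    // Returns true if `user` may read a file with the given symbolic mode.
    static boolean canRead(String mode, String fileOwner, String fileGroup,
                           String user, Set<String> userGroups) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        if (user.equals(fileOwner)) {
            // The owner is judged by the owner bits only.
            return perms.contains(PosixFilePermission.OWNER_READ);
        }
        if (userGroups.contains(fileGroup)) {
            // A member of the file's group is judged by the group bits only.
            return perms.contains(PosixFilePermission.GROUP_READ);
        }
        // Everyone else is judged by the "other" bits.
        return perms.contains(PosixFilePermission.OTHERS_READ);
    }

    public static void main(String[] args) {
        Set<String> user2Groups = Set.of("analysts"); // assumed shared group
        // Mode 700: user2 cannot read user1's file, not even via a shared group.
        System.out.println(canRead("rwx------", "user1", "analysts", "user2", user2Groups)); // false
        // Mode 750: every member of the group can read, which is the concern above.
        System.out.println(canRead("rwxr-x---", "user1", "analysts", "user2", user2Groups)); // true
    }
}
```

With only one owner and one group per file, there is no mode that grants read access to exactly one extra user without a dedicated group for that pair.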
Let me give more background:
We want to implement Hive security. We think the Hive user should
be propagated to HDFS and MR, but currently, just using hiveserver's
We think a user's table data in HDFS should be mode 700 only; otherwise,
other users can easily read the data directly through the HDFS API.
In Hive, one SQL statement that accesses multiple users' data should be
allowed; in an RDBMS like Oracle, this requirement is a basic function.
So on the hiveserver side, we will check whether the user has permission to
access another user's table. If so, one SQL statement may access multiple
users' table data in HDFS through an MR job.
According to what you said below, this requirement is difficult to meet.
I will think about it more. Thank you, and more suggestions are welcome.
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Robert
Sent: February 11, 2013 23:19
To: [EMAIL PROTECTED]
Subject: Re: Re: in security mode, one MR job visit two user's data
I think he is talking about using groups and read only permissions.
Once the table is loaded into Hive, you can make the files read-only for a
group that both users share. The Hadoop code is really not set up to allow a
single job to pretend to be more than one user. You might be able to fake
it, but because the assumption has always been one user there are likely to
be other problems that you run into, even if you get the tokens to work. I
think the preferable alternative would be to work for true ACLs in HDFS.
Then you can set up an ACL to give read only access to the table for the one
user that needs it, and you don't have to set up a special HDFS group for
On 2/9/13 8:31 PM, "wang" <[EMAIL PROTECTED]> wrote:
>Thank you for your response.
>In Hive, a user can directly execute the load path command; if the directory
>is accessible by two users, then one user can directly load another user's
>data into his own table. Also, a user can execute dfs commands directly
>through hiveserver. So a user's data in HDFS had better be mode 700.
>Is it possible for me to customize the TokenSelector? What I want is that
>the job client gets all users' delegation tokens, and a map task
>chooses the correct user's token according to the path it accesses.
>I am not sure whether I can achieve this or how much effort it would
>require. I am still thinking about this; guidance from you is welcome.
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED]] On Behalf Of
>Sent: February 10, 2013 0:21
>To: [EMAIL PROTECTED]
>Subject: Re: in security mode, one MR job visit two user's data
>How about leveraging filesystem permissions so the user has access to
>On Feb 9, 2013, at 1:54 AM, "wang" <[EMAIL PROTECTED]> wrote:
>> In security mode, is it possible for one MR job to access two users' data
>>in HDFS? That is: there are two maps in one job; one map reads user1's
>>data, the other reads user2's data. As far as I know, before submitting
>>the job, the JobClient gets the delegation token for the MR tasks, but in
>>the Credentials class, the token map can hold only one token per type of
>>service. If I get user2's token and add it to the credentials, user1's
>>will be overwritten.
>> Has anyone met the same situation, or can someone give some suggestions?
>> The background is that in Hive, one SQL statement may access different users' data.
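
The overwrite described in the original question, and the per-path token selection asked about in the follow-up, can be sketched with a plain map. This is a simplified model and not Hadoop's Credentials or TokenSelector API: the class, the service string "hdfs://nn:8020", the "#" key separator and the /user/<name>/ path layout are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model only: NOT Hadoop's Credentials class or TokenSelector interface.
class TokenMapSketch {
    static final String SERVICE = "hdfs://nn:8020"; // hypothetical NameNode service name

    // Keyed by service alone: the second token for the same service replaces the first.
    static Map<String, String> byService() {
        Map<String, String> tokens = new HashMap<>();
        tokens.put(SERVICE, "token-of-user1");
        tokens.put(SERVICE, "token-of-user2"); // overwrites user1's entry
        return tokens;
    }

    // Hypothetical alternative: key by service plus user so both tokens survive.
    static Map<String, String> byServiceAndUser() {
        Map<String, String> tokens = new HashMap<>();
        tokens.put(SERVICE + "#user1", "token-of-user1");
        tokens.put(SERVICE + "#user2", "token-of-user2");
        return tokens;
    }

    // Hypothetical selector: assumes paths look like /user/<name>/... and
    // picks the token stored for that name. Returns null if none matches.
    static String selectForPath(Map<String, String> tokens, String service, String path) {
        String[] parts = path.split("/"); // "/user/user1/t" -> ["", "user", "user1", "t"]
        if (parts.length < 3 || !"user".equals(parts[1])) {
            return null;
        }
        return tokens.get(service + "#" + parts[2]);
    }

    public static void main(String[] args) {
        System.out.println(byService().size());        // 1
        System.out.println(byServiceAndUser().size()); // 2
        System.out.println(selectForPath(byServiceAndUser(), SERVICE, "/user/user2/t/part-0"));
    }
}
```

Even if both tokens can be kept and selected this way, the rest of the job still runs as a single user, which is the limitation raised in the reply.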