-
Passing scan authorizations that exceed the Accumulo user's authorizations
John Stoneham 2011-12-15, 18:50
Hi,
I'm wondering about the expected behavior when creating a Scanner with authorizations that exceed the Accumulo user's authorizations. (For example, if some authentication mechanism gave a user AUTH1, AUTH2, and AUTH3, but the particular Accumulo user in use has had AUTH3 removed from its authorizations temporarily.) Current behavior on the 1.3 line is to throw an exception if a scan is attempted with any authorizations which the Accumulo user does not possess.
The docs are inconsistent. The manual on the 1.4 line reads: When a user creates a scanner a set of Authorizations is passed. If the authorizations passed to the scanner are not a subset of the users authorizations, then an exception will be thrown.
However, the Javadocs on Connector.getBatchScanner read: A set of authorization labels that will be checked against the column visibility of each key inorder to filter data. The authorizations passed in for scanning are intersected with the accumulo users set of authorizations. So if the accumulo user has authorizations (A1, A2) and authorizations (A2,A3) are passed, then (A2) will be used for the scan.
As an Accumulo user I'd prefer the behavior documented in getBatchScanner (that is, intersect on the server side the Accumulo user's authorizations with the authorizations passed). In that situation, I can safely pass all the authorizations my end-user might have, including any unexpected new (or dynamic) ones that weren't known when I started the application. Updating the Accumulo user's authorizations (addition or removal) would not require an application restart.
With the current situation, I have the following less happy scenarios to choose from:
1) Retrieve the Accumulo user's authorizations at startup and perform this intersection logic in application code each time. New authorizations added to the Accumulo user won't be effective until an application restart. Authorizations removed from the Accumulo user have the potential to cause application errors until an application restart.
2) Retrieve the Accumulo user's authorizations periodically and do the same. Same characteristics as #1 except that the time window is reduced.
3) Retrieve the Accumulo user's authorizations for each scan, then intersect myself. Adds an extra round trip to every scan, and there's a race condition if auths are being modified simultaneously.
4) Whitelist authorizations coming from the authentication layer to ones I know are on the Accumulo user, keep the whitelist config in sync with the server config (or retrieve them at startup) and just take down the application before any authorization changes.
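The client-side intersection that options 1 through 3 rely on can be sketched in plain Java. This is only an illustration under stated assumptions: authorizations are modeled as a Set<String> rather than Accumulo's own Authorizations class, and AuthIntersection and intersect are hypothetical names, not part of the Accumulo API. In a real application the Accumulo user's set would have to be fetched from the server first.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the Accumulo API.
public class AuthIntersection {

    // Returns the labels present in both sets, leaving the inputs unmodified.
    public static Set<String> intersect(Set<String> endUserAuths,
                                        Set<String> accumuloUserAuths) {
        Set<String> result = new TreeSet<>(endUserAuths);
        result.retainAll(accumuloUserAuths);
        return Collections.unmodifiableSet(result);
    }

    public static void main(String[] args) {
        // The end user holds AUTH3, but it was removed from the Accumulo user.
        Set<String> endUser = new TreeSet<>(Arrays.asList("AUTH1", "AUTH2", "AUTH3"));
        Set<String> accumuloUser = new TreeSet<>(Arrays.asList("AUTH1", "AUTH2"));
        System.out.println(intersect(endUser, accumuloUser)); // prints [AUTH1, AUTH2]
    }
}
```

The intersection itself is cheap; the cost in each option comes from how fresh the Accumulo user's set is when it is passed in.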
I thought I'd ask the list for thoughts on this before I file an issue. Perhaps there are constraints or solutions I haven't thought of.
Thanks,
- John
-- John Stoneham [EMAIL PROTECTED]
-
Re: Passing scan authorizations that exceed the Accumulo user's authorizations
Keith Turner 2011-12-15, 19:16
Another option is to retrieve the Accumulo user's auths when you get an exception. This makes the problems with keeping things in sync go away. It would be slightly painful: you would probably need to catch a RuntimeException and get its cause, since the scanner implements the Iterator interface and so can only throw runtime exceptions.
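A rough sketch of that retry-on-exception idea, in plain Java, might look like the following. Everything here is an assumption made for illustration: RetryingScan and scanWithRetry are hypothetical names, the scan is modeled as a Function over a Set<String> of authorization labels, and java.lang.SecurityException stands in for whatever cause the real client wraps in its RuntimeException.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the Accumulo API.
public class RetryingScan {

    // Tries the scan with the requested auths. If it fails with a
    // RuntimeException whose cause is a SecurityException (a stand-in for the
    // real security failure), re-fetches the Accumulo user's auths,
    // intersects them with the requested ones, and retries once.
    public static <T> T scanWithRetry(Set<String> requested,
                                      Supplier<Set<String>> fetchAccumuloAuths,
                                      Function<Set<String>, T> scan) {
        try {
            return scan.apply(requested);
        } catch (RuntimeException e) {
            Throwable cause = (e.getCause() != null) ? e.getCause() : e;
            if (!(cause instanceof SecurityException)) {
                throw e; // unrelated failure: do not mask it
            }
            Set<String> narrowed = new TreeSet<>(requested);
            narrowed.retainAll(fetchAccumuloAuths.get());
            return scan.apply(narrowed);
        }
    }
}
```

The extra round trip is paid only when a scan actually fails, which is what distinguishes this from fetching the authorizations before every scan. A second failure on the retry is allowed to propagate.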
As far as changing the behavior, it is something to consider. We decided to go with the current behavior after many, many instances of people not getting data back and not knowing why (silent intersection was causing data to be dropped). We could make the behavior configurable: by default it throws an exception, but the user is allowed to turn this off. What do you think about this? I think it's worth opening a ticket and continuing the conversation there.
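To make the configurable idea concrete, here is a minimal sketch of the two policies side by side. It is not how Accumulo implements the check: ScanAuthPolicy, Mode, and resolve are hypothetical names, and authorizations are modeled as a Set<String>.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a configurable policy; not Accumulo code.
public class ScanAuthPolicy {

    public enum Mode {
        STRICT,    // reject any scan that requests auths the user lacks
        INTERSECT  // silently drop the auths the user lacks
    }

    // Returns the authorizations a scan would effectively run with.
    public static Set<String> resolve(Set<String> requested,
                                      Set<String> userAuths,
                                      Mode mode) {
        if (mode == Mode.STRICT && !userAuths.containsAll(requested)) {
            Set<String> missing = new TreeSet<>(requested);
            missing.removeAll(userAuths);
            throw new IllegalArgumentException("User lacks authorizations: " + missing);
        }
        Set<String> effective = new TreeSet<>(requested);
        effective.retainAll(userAuths);
        return effective;
    }
}
```

With STRICT as the default, a caller who gets no data back because of missing authorizations sees an error instead of an empty result; INTERSECT is the opt-in for applications that deliberately pass a superset.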
I think the batch scanner documentation is just wrong. That is how it used to work; it's a documentation bug. I will open a bug for this.
-
Re: Passing scan authorizations that exceed the Accumulo user's authorizations
Keith Turner 2011-12-15, 19:30
One thought about the old behavior of intersecting scan auths with user auths: it is similar to a file system that gives you a zero-length file when you try to open a file you cannot read.
-
Re: Passing scan authorizations that exceed the Accumulo user's authorizations
Billie J Rinaldi 2011-12-16, 14:18
On Thursday, December 15, 2011 1:50:49 PM, "John Stoneham" <[EMAIL PROTECTED]> wrote:
> As an Accumulo user I'd prefer the behavior documented in
> getBatchScanner (that is, intersect on the server side the Accumulo
> user's authorizations with the authorizations passed). In that
> situation, I can safely pass all the authorizations my end-user might
> have, including any unexpected new (or dynamic) ones that weren't
> known when I started the application. Updating the Accumulo user's
> authorizations (addition or removal) would not require an application
> restart.
It sounds like you always want to scan with all the authorizations your user has. In that case, you don't need a list of all possible authorizations to pass in -- just pass in the user's actual authorizations, which can be retrieved with connector.securityOperations().getUserAuthorizations(user).
Billie
-
Re: Passing scan authorizations that exceed the Accumulo user's authorizations
John Stoneham 2011-12-16, 14:23
On Fri, Dec 16, 2011 at 9:18 AM, Billie J Rinaldi <[EMAIL PROTECTED]> wrote:
> It sounds like you always want to scan with all the authorizations your
> user has. In that case, you don't need a list of all possible
> authorizations to pass in -- just pass in the user's actual authorizations,
> which can be retrieved with
> connector.securityOperations().getUserAuthorizations(user).
Currently, I want to scan using the intersection of my (human) user's authorizations and the Accumulo (application) user's authorizations. What you've listed is the call I'm using to get the Accumulo user's authorizations. But if my user were to have an authorization that, for some reason, had been removed from the Accumulo user, or that was available on one deployment or cluster but not on another, I'd have a problem unless I performed this intersection in the application.
-- John Stoneham [EMAIL PROTECTED]