Accumulo >> mail # user >> Security and data design advice on structuring data on accumulo

Re: Security and data design advice on structuring data on accumulo
The underlying issue I'm poking at is this:

Pluggable authorization systems I've seen attached to Accumulo in the
past have operated in the following fashion: A single superuser in
Accumulo has all of the authorizations for data stored in Accumulo. The
authorization system determines the correct Accumulo Authorizations for
the current user and intersects the user's Authorizations with the
superuser's Authorizations (read as: all Authorizations) to perform a
scan over Accumulo at the desired level. Thus, end-users don't have
accounts on Accumulo; user queries run as the superuser.
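
As a rough sketch of that intersection step (plain Python, not the
Accumulo API; the label names, user names, and the lookup table are all
made up for illustration):

```python
# All labels present in the data; the single Accumulo superuser holds these.
# (Hypothetical labels, chosen to match the examples in this thread.)
SUPERUSER_AUTHS = frozenset({
    "adamhealthteam", "joshhealthteam",
    "regularCheckup", "illnessEvaluation", "populationStudy",
})

# Stand-in for the external authorization system (hypothetical data).
USER_DIRECTORY = {
    "drSmith": {"adamhealthteam", "joshhealthteam", "regularCheckup", "retiredLabel"},
    "adam": {"adamhealthteam"},
}


def lookup_user_auths(user):
    """Ask the (assumed) external service which labels this end-user holds."""
    return frozenset(USER_DIRECTORY.get(user, ()))


def scan_auths_for(user):
    """Labels to scan with: the user's labels that the superuser also holds."""
    return lookup_user_auths(user) & SUPERUSER_AUTHS


print(sorted(scan_auths_for("drSmith")))
```

The scan itself would still be issued by the superuser account, just
restricted to the returned set; a label the superuser doesn't hold
("retiredLabel" here) simply drops out.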

Back to the current example, as you said, the number of "groups" should
grow roughly linearly with the number of users; however, this now requires
that every user has an Accumulo account. The difference is that a doctor
will be in many users' groups (e.g. you and I could share a doctor). To
my understanding, all of this user/authorization information is stored
inside of ZooKeeper. It seems less than ideal to me to store user
accounts for every patient and every doctor, where every doctor has many
"roles", but it also appears intractable to me to have a
single superuser with all auths (as previously outlined).
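
To make the counting concrete, here is a small sketch (again plain
Python with invented patients and doctors, and an assumed
"<patient>healthteam" naming scheme) showing one label per patient while
a shared doctor accumulates one authorization per patient:

```python
from collections import defaultdict


def build_authorizations(care_teams):
    """care_teams maps patient -> list of doctors (hypothetical input).

    Returns (labels, doctor_auths): one label per patient, and for each
    doctor the set of per-patient labels they would need to hold.
    """
    labels = set()
    doctor_auths = defaultdict(set)
    for patient, doctors in care_teams.items():
        label = f"{patient}healthteam"  # assumed naming scheme
        labels.add(label)
        for doctor in doctors:
            doctor_auths[doctor].add(label)
    return labels, dict(doctor_auths)


care_teams = {
    "adam": ["drSmith", "drJones"],
    "josh": ["drSmith"],
    "benson": ["drSmith"],
}
labels, doctor_auths = build_authorizations(care_teams)

print(len(labels))                    # one label per patient
print(sorted(doctor_auths["drSmith"]))  # grows with drSmith's patient list
```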

I'm sure a user-roles approach could work to a point, but I feel like
there is potential for a much more elegant solution. I'm curious if
others have had thoughts about this.

On 8/10/12 12:05 PM, Adam Fuchs wrote:
> But that's not really n*m, since it only specifies me by name. This
> should be roughly linear with users, no?
> There is definitely a reliance on some external service managing the
> roles that docs are in, but this should be tractable.
> Adam
> On Aug 10, 2012 11:56 AM, "Josh Elser" <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>     That's what I meant, user*doctors.
>     It's not enough to say "healthteam", you have to qualify it by
>     user too: "adamhealthteam".
>     On 8/10/12 9:02 AM, Adam Fuchs wrote:
>>     I guess I should have specified that the access time labels
>>     should be used in conjunction with the role labels, like
>>     "(adamsHealthTeam&(regularCheckup|illnessEvaluation))|(massStateResearcher&populationStudy)".
>>     Adam
>>     On Aug 10, 2012 8:56 AM, "Benson Margulies"
>>     <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>>         On Fri, Aug 10, 2012 at 8:52 AM, Adam Fuchs
>>         <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote:
>>         > Not sure I understand why this gets into n*m roles. Can you
>>         > elaborate?
>>         >
>>         > The question of when your physician should have access seems
>>         > like it could be represented by just a few labels, like
>>         > "regularCheckup", "illnessEvaluation", and "populationStudy".
>>         > Those labels could then be tied to an auditing system that
>>         > could verify appropriateness of access over time.
>>         And if you change doctors? Maybe that's a job for some sort
>>         of role/group model.
>>         >
>>         > Adam
>>         >
>>         > On Aug 9, 2012 10:19 PM, "Josh Elser" <[EMAIL PROTECTED]
>>         <mailto:[EMAIL PROTECTED]>> wrote:
>>         >>
>>         >> I've thought quite a bit about the approach you've outlined
>>         >> previously..
>>         >>
>>         >> The main caveat I've always struggled to overcome is how to
>>         >> encapsulate *when* a physician should have access to your
>>         >> records. This expands the problem into n*m roles which becomes
>>         >> difficult to manage inside Accumulo, especially as time elapses.
>>         >>
>>         >> On 8/8/2012 6:29 PM, Marc Parisi wrote:
>>         >>>
>>         >>> Just some ideas and thoughts....
>>         >>>
>>         >>> With a system I'm building I have code to take care of user
>>         >>> roles. Roles
>>         >>> will define visibilities, how analysis is performed,