Accumulo, mail # user - Security and data design advice on structuring data on accumulo


Re: Security and data design advice on structuring data on accumulo
Christopher Tubbs 2012-08-11, 01:05
I think an important take-away here (so far) is that you can't just
use "doctor" as a role... because that doesn't encapsulate all the
security considerations. Doctor X doesn't get to see patient Y's data,
unless X is Y's doctor, or Y has signed a release for him/her to see
it. So, "doctorOf<Y>" is an essential consideration. If this were all
that needed to be encapsulated, then the labels would grow roughly
linearly with the number of patients, not the number of "users" (if
patients happen to be users, that's simply a coincidence).
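
To make the difference concrete, here is a minimal sketch of how a label
expression might be checked against a user's authorizations. It is not
Accumulo's implementation: is_visible and tokenize are hypothetical helpers,
and the grammar is simplified (it gives & higher precedence than |, whereas
Accumulo expects explicit parentheses when the two are mixed).

```python
import re

def tokenize(expr):
    # Labels are runs of word characters; operators are &, | and parentheses.
    tokens = re.findall(r"\w+|[&|()]", expr)
    if "".join(tokens) != expr.replace(" ", ""):
        raise ValueError("unexpected character in expression")
    return tokens

def is_visible(expr, auths):
    """Return True if the authorization set satisfies the label expression.

    Simplified sketch only; not Accumulo's ColumnVisibility logic.
    """
    tokens = tokenize(expr)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse, so the position advances
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_term()
            result = result and rhs
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError("unexpected operator: " + tok)
        return tok in auths  # a bare label is satisfied only if the user holds it

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result

# A bare "doctor" label lets every doctor in.
print(is_visible("doctor", {"doctor"}))                                # True
# A patient-relative label only admits Y's own doctor.
print(is_visible("doctorOfPatientY", {"doctor", "doctorOfPatientX"}))  # False
print(is_visible("doctorOfPatientY", {"doctorOfPatientY"}))            # True
```

The same helper also handles compound expressions such as the one quoted
further down in this thread.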

Since patient privacy is primarily what is being protected, I'd make
the roles relative to the patient:
doctorOfPatientX
familyMemberOfPatientX
isPatientX
lawyerOfPatientX
insurerOfPatientX
nurseOfPatientX
etc...

So, the number of roles would scale as n*m, where n is the number of
patients, and m is a roughly fixed set of roles relative to each patient
(m should be pretty small).
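
As a rough illustration of that n*m growth, the sketch below generates the
patient-relative labels from a fixed list of role prefixes. The prefixes come
from the list above; the helper names and the patient identifiers are
hypothetical.

```python
# The m role prefixes, taken from the list above.
ROLE_PREFIXES = ["doctorOf", "familyMemberOf", "is", "lawyerOf", "insurerOf", "nurseOf"]

def labels_for(patient_id):
    # One label per role prefix for a single patient, e.g. "doctorOfPatientX".
    return [prefix + patient_id for prefix in ROLE_PREFIXES]

def all_labels(patient_ids):
    # n patients times m prefixes.
    return [label for pid in patient_ids for label in labels_for(pid)]

patients = ["PatientX", "PatientY", "PatientZ"]  # hypothetical identifiers
labels = all_labels(patients)
print(labels[:3])   # ['doctorOfPatientX', 'familyMemberOfPatientX', 'isPatientX']
print(len(labels))  # 18, i.e. n * m = 3 * 6
```

Adding a patient adds m labels; adding a doctor adds none, since a new
doctor is simply granted existing labels.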

You could put the patient in the row, but then you're relying on an
external system to filter the data (constrain the query) based on
roles *that* system understands. The built-in Accumulo roles would
simply constrain that external query system.
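
A small sketch of that row-based alternative, under stated assumptions: the
table is an in-memory list standing in for the store, scan stands in for a
store-side scan with a coarse "doctor" authorization, and query_as is a
hypothetical external service that knows who treats whom. None of these names
are Accumulo APIs, and the cell values are placeholders.

```python
# Hypothetical cells: (row, column, visibility label, value).
TABLE = [
    ("patientY", "diagnosis", "doctor", "<value for Y>"),
    ("patientZ", "diagnosis", "doctor", "<value for Z>"),
]

# Knowledge held only by the external system: which user treats which patient.
CARE_RELATIONSHIPS = {("drX", "patientY")}

# Coarse authorizations held by the external query service itself.
SERVICE_AUTHS = {"doctor"}

def scan(table, row, auths):
    # Stand-in for the store: it only checks the coarse label on each cell.
    return [(col, val) for (r, col, vis, val) in table if r == row and vis in auths]

def query_as(user, row, table, relationships, service_auths):
    # The per-patient restriction lives here, outside the store.
    if (user, row) not in relationships:
        raise PermissionError(user + " may not read " + row)
    return scan(table, row, service_auths)

print(query_as("drX", "patientY", TABLE, CARE_RELATIONSHIPS, SERVICE_AUTHS))
# Anyone who reaches the store directly with the "doctor" authorization
# can still read patientZ; only the external layer prevents it.
print(scan(TABLE, "patientZ", SERVICE_AUTHS))
```

In this arrangement the store enforces only that the caller is some doctor;
which patients that doctor may see depends entirely on the external layer
being correct and being the only path to the data.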

On Fri, Aug 10, 2012 at 1:00 PM, Marc Parisi <[EMAIL PROTECTED]> wrote:
> My suggestion of roles was to have a finite number of roles, with a finite
> number of actions. You would only store auths for those roles and actions.
> Another lookup mechanism, in my system, will determine which user to use
> (as I recall; I don't have the code in front of me). I did mention something
> about putting an id (a key id, perhaps) in the CV; however, this could be
> moved elsewhere.
>
> "doctor" is a role. Dr. Parisi is not a role; it's a lookup to see if Parisi
> is a doctor, and if so, use that user (role). The doctor user would have the
> CV to see the user visibility. With the cryptographic hash in the CV, the goal
> was to limit which patients a doctor could see, but I can just as easily put
> that in the row to enforce that limitation.
>
> Hopefully that makes sense.
>
> On Fri, Aug 10, 2012 at 12:33 PM, Edmon Begoli <[EMAIL PROTECTED]> wrote:
>>
>> > But that's not really n*m, since it only specifies me by name. This
>> > should be roughly linear with users, no?
>>
>> Correct.
>>
>> On Fri, Aug 10, 2012 at 12:05 PM, Adam Fuchs <[EMAIL PROTECTED]> wrote:
>> > But that's not really n*m, since it only specifies me by name. This
>> > should be roughly linear with users, no?
>> >
>> > There is definitely a reliance on some external service managing the
>> > roles that docs are in, but this should be tractable.
>> >
>> > Adam
>> >
>> > On Aug 10, 2012 11:56 AM, "Josh Elser" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> That's what I meant, user*doctors.
>> >>
>> >> It's not enough to say "healthteam", you have to qualify it by user
>> >> too: "adamhealthteam".
>> >>
>> >> On 8/10/12 9:02 AM, Adam Fuchs wrote:
>> >>
>> >> I guess I should have specified that the access time labels should be
>> >> used in conjunction with the role labels, like
>> >>
>> >> "(adamsHealthTeam&(regularCheckup|illnessEvaluation))|(massStateResearcher&populationStudy)".
>> >>
>> >> Adam
>> >>
>> >> On Aug 10, 2012 8:56 AM, "Benson Margulies" <[EMAIL PROTECTED]> wrote:
>> >>>
>> >>> On Fri, Aug 10, 2012 at 8:52 AM, Adam Fuchs <[EMAIL PROTECTED]> wrote:
>> >>> > Not sure I understand why this gets into n*m roles. Can you
>> >>> > elaborate?
>> >>> >
>> >>> > The question of when your physician should have access seems like it
>> >>> > could be represented by just a few labels, like "regularCheckup",
>> >>> > "illnessEvaluation", and "populationStudy". Those labels could then
>> >>> > be tied to an auditing system that could verify appropriateness of
>> >>> > access over time.
>> >>>
>> >>> And if you change doctors? Maybe that's a job for some sort of
>> >>> role/group model.
>> >>>
>> >>>
>> >>> >
>> >>> > Adam
>> >>> >
>> >>> > On Aug 9, 2012 10:19 PM, "Josh Elser" <[EMAIL PROTECTED]> wrote: