Accumulo, mail # user - Security and data design advice on structuring data on accumulo


Re: Security and data design advice on structuring data on accumulo
Benson Margulies 2012-08-10, 12:56
On Fri, Aug 10, 2012 at 8:52 AM, Adam Fuchs <[EMAIL PROTECTED]> wrote:
> Not sure I understand why this gets into n*m roles. Can you elaborate?
>
> The question of when your physician should have access seems like it could
> be represented by just a few labels, like "regularCheckup",
> "illnessEvaluation", and "populationStudy". Those labels could then be tied
> to an auditing system that could verify appropriateness of access over time.

And if you change doctors? Maybe that's a job for some sort of role/group model.
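
To make the role/group idea concrete, here is a minimal sketch in plain Python. It models the idea only and does not use Accumulo's API. Every name in it (RECORD_LABELS, GROUP_PURPOSES, memberships, the record names, the user IDs) is made up for illustration. The data carries only the purpose labels suggested above, treated here as an OR of labels, and the doctor-patient relationship lives in a separate membership table, so changing doctors edits one membership entry and leaves the stored labels alone.

```python
# Hypothetical sketch: names and mappings are assumptions, not Accumulo's API.

# Purpose labels attached to data once, at write time (read as label1|label2).
RECORD_LABELS = {
    "visit-notes": {"regularCheckup", "illnessEvaluation"},
    "lab-results": {"illnessEvaluation"},
    "deidentified-summary": {"populationStudy"},
}

# Group -> purposes that members of the group may read for (assumed mapping).
GROUP_PURPOSES = {
    "primaryCare": {"regularCheckup", "illnessEvaluation"},
    "researcher": {"populationStudy"},
}

# (patient, user) -> group. This is the only state that changes
# when a patient switches doctors.
memberships = {("patient42", "dr_adams"): "primaryCare"}


def authorizations(user, patient):
    """Return the purpose labels this user holds for this patient."""
    group = memberships.get((patient, user))
    return GROUP_PURPOSES.get(group, set())


def readable(user, patient):
    """Return the sorted record names whose labels overlap the user's authorizations."""
    auths = authorizations(user, patient)
    return sorted(name for name, labels in RECORD_LABELS.items() if labels & auths)


def change_doctor(patient, old_user, new_user):
    """Move the patient's group membership from one doctor to another."""
    group = memberships.pop((patient, old_user))
    memberships[(patient, new_user)] = group
```

Calling change_doctor("patient42", "dr_adams", "dr_baker") moves access to the new doctor without rewriting any record. An audit trail of membership changes could cover the "over time" question, though that part is not sketched here.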
>
> Adam
>
> On Aug 9, 2012 10:19 PM, "Josh Elser" <[EMAIL PROTECTED]> wrote:
>>
>> I've thought quite a bit about the approach you've outlined previously.
>>
>> The main caveat I've always struggled to overcome is how to encapsulate
>> *when* a physician should have access to your records. This expands the
>> problem into n*m roles, which becomes difficult to manage inside Accumulo,
>> especially as time elapses.
>>
>> On 8/8/2012 6:29 PM, Marc Parisi wrote:
>>>
>>> Just some ideas and thoughts....
>>>
>>> With a system I'm building I have code to take care of user roles. Roles
>>> will define visibilities, how analysis is performed, information
>>> sharing, etc. I have a particular role for sharing. I also have an area
>>> of interest, usually assigned to a physician role, therefore only a
>>> physician's office can see certain data from it. The data corresponding
>>> to a given person can be accessed by that person (if they have app
>>> access), the physician that created it, and other physicians (with a
>>> different area of interest) with whom the user wants to share their
>>> data. Each area of interest will be cryptographically secured. Our
>>> approach will utilize multiple crypto technologies. I would suggest
>>> making crypto your last stop. Focus on getting the visibility hierarchy
>>> designed. HIPAA requirements can come later.
>>>
>>> In my approach, there is no elevation of fields per se. Instead, there
>>> are visibilities for all assigned parties, so in my case it is a matter
>>> of labeling. The data can have hierarchies, and each hierarchy has
>>> different labels to control access.
>>>
>>> "Patient demographic fields are PHI (personal health information) and
>>> these should not be visible to all who want to perform analysis, but
>>> only to main administrators, patient and maybe physician. I assume
>>> these would have to have separate authorization label."
>>>
>>> Yes. I think this is where roles will help. Assign roles and
>>> visibilities to those roles. As of right now, I'm putting ephemeral data
>>> in my visibilities (user ID for a physician, among other things). I
>>> will probably move this to the qualifier and take a simpler approach
>>> to visibilities.
>>>
>>> Each role has different actions. Right now I have four actions: syncing,
>>> querying, deleting, and sharing. You don't have to capture actions, but
>>> you might want to limit how the roles of users vary, and I think
>>> modeling the security actions within each role is an excellent way to do
>>> so.
>>>
>>>
>>> On Wed, Aug 8, 2012 at 4:08 PM, Edmon Begoli <[EMAIL PROTECTED]> wrote:
>>>
>>>     I am trying to model a healthcare claim on Accumulo and I want to
>>>     lay it out so that it:
>>>
>>>     A. Accurately reflects the structure of the claim
>>>
>>>     B. Lets me apply fine-grained controls to different sections of the
>>>     document
>>>
>>>     I am simplifying matters, but a claim contains claim document
>>>     identifiers, demographics of the patient, and line items for the
>>>     procedures performed:
>>>
>>>     claim identifier, data submitted, data processed, state of origin, ...
>>>     patient name, dob, location, other identifiers
>>>     procedure 1 code, procedure 1 provider, procedure 1 cost, ...
>>>     ...
>>>     procedure n code, procedure n provider, procedure n cost, ...
>>>
>>>
>>>     Patient demographic fields are PHI (personal health information) and