A standardised graphic method for describing data privacy frameworks in primary care research using a flexible zone model

Int J Med Inform. 2014 Dec;83(12):941-57. doi: 10.1016/j.ijmedinf.2014.08.009. Epub 2014 Sep 3.


Purpose: To develop a model that describes the core concepts and principles of data flow, data privacy and confidentiality in a simple and flexible way, using concise process descriptions and a diagrammatic notation applied to research workflow processes. The model should help to generate robust data privacy frameworks for research done with patient data.

Methods: Based on an exploration of EU legal requirements for data protection and privacy, data access policies, and existing privacy frameworks of research projects, basic concepts and common processes were extracted, described and incorporated into a model with a formal graphical representation and a standardised notation. The Unified Modelling Language (UML) notation was enriched with workflow symbols and custom symbols to represent extended data flow requirements, data privacy and data security requirements, and privacy-enhancing techniques (PETs), and to allow privacy threat analysis for research scenarios.

Results: Our model is built upon the concept of three privacy zones (Care Zone, Non-care Zone and Research Zone) containing databases and data transformation operators, such as data linkers and privacy filters. Using these model components, a risk gradient for moving data from a zone of high risk for patient identification to a zone of low risk can be described. The model was applied to the analysis of data flows in several general clinical research use cases and two research scenarios from the TRANSFoRm project (e.g., finding patients for clinical research and linkage of databases). The model was validated by representing research done with the NIVEL Primary Care Database in the Netherlands.
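
Illustration: the zone concept can be sketched in a few lines of Python. This is a minimal sketch and not the notation or implementation of the paper. It assumes that the Care Zone carries the highest risk of patient identification and the Research Zone the lowest, and that any flow from a higher-risk zone to a lower-risk zone should pass through a privacy filter. The names `Zone`, `DataFlow`, `privacy_filter` and `unfiltered_crossings` are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Zone(IntEnum):
    # Assumed ordering: a higher value means a higher risk of patient identification.
    RESEARCH = 1
    NON_CARE = 2
    CARE = 3


@dataclass(frozen=True)
class DataFlow:
    source: Zone
    target: Zone
    privacy_filter: bool = False  # hypothetical flag: a privacy filter sits on this flow


def unfiltered_crossings(flows):
    """Return flows that move data to a lower-risk zone without a privacy filter."""
    return [f for f in flows if f.source > f.target and not f.privacy_filter]


# Hypothetical example flows, not taken from the paper's use cases.
flows = [
    DataFlow(Zone.CARE, Zone.NON_CARE, privacy_filter=True),
    DataFlow(Zone.NON_CARE, Zone.RESEARCH, privacy_filter=False),
    DataFlow(Zone.RESEARCH, Zone.RESEARCH),
]
weak_points = unfiltered_crossings(flows)
```

Under these assumptions, `weak_points` holds only the flow from the Non-care Zone to the Research Zone, which is the kind of candidate weak point a structured data flow analysis is meant to surface.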

Conclusions: The model allows data privacy and confidentiality issues in research with patient data to be analysed in a structured way. It provides a framework to specify a privacy-compliant data flow, to communicate privacy requirements and to identify weak points that stand in the way of an adequate implementation of data privacy.

Keywords: Anonymisation; Confidentiality; Data linkage; Medical research; Privacy; Pseudonymisation; Zones.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomedical Research*
  • Computer Security*
  • Confidentiality
  • Databases, Factual
  • Health Policy
  • Humans
  • Medical Records Systems, Computerized*
  • Models, Theoretical*
  • Netherlands
  • Patient Care
  • Primary Health Care / standards*
  • Privacy / legislation & jurisprudence*