Privacy-Preserving Linkage of Genomic and Clinical Data Sets

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1342-1348. doi: 10.1109/TCBB.2018.2855125. Epub 2018 Jul 30.


Linking records associated with the same individual across data sets is a key challenge for data-driven research. The challenge is exacerbated when data sets may include both genomic and clinical data and may span multiple legal jurisdictions, and by the need to enable re-identification in limited circumstances. Privacy-Preserving Record Linkage (PPRL) methods address these challenges. In 2016, the Interdisciplinary Committee of the International Rare Diseases Research Consortium (IRDiRC) launched a task team, the PPRL Task Force, to explore approaches to PPRL. The Task Force is a collaboration with the Global Alliance for Genomics and Health (GA4GH) Regulatory and Ethics and Data Security Work Streams. It aims to prepare policy and technology standards that enable highly reliable linking of records associated with the same individual without disclosing that individual's identity. The exception is when use of the data has produced information important to the individual's safety or health and applicable law allows or requires the return of results. The PPRL Task Force has examined the ethico-legal requirements, constraints, and implications of PPRL, and has applied this knowledge to exploring technical methods and approaches to PPRL. This paper reports and justifies the findings and recommendations thus far.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Big Data
  • Computer Security*
  • Confidentiality*
  • Databases, Factual
  • Europe
  • Genetic Linkage
  • Genome, Human
  • Genomics*
  • Humans
  • Interdisciplinary Communication
  • Medical Informatics / methods*
  • Medical Informatics / standards
  • Rare Diseases / genetics
  • United States

Grants and funding