Identifiability in biobanks: models, measures, and mitigation strategies

Hum Genet. 2011 Sep;130(3):383-92. doi: 10.1007/s00439-011-1042-5. Epub 2011 Jul 8.


The collection and sharing of person-specific biospecimens has raised significant questions regarding privacy. In particular, the question of identifiability, or the degree to which materials stored in biobanks can be linked to the names of the individuals from whom they were derived, is under scrutiny. The goal of this paper is to review the extent to which biospecimens and affiliated data can be designated as identifiable. To achieve this goal, we summarize recent research in identifiability assessment for DNA sequence data, as well as associated demographic and clinical data, shared via biobanks. We demonstrate the variability of the degree of risk, the factors that contribute to this variation, and potential ways to mitigate and manage such risk. Finally, we discuss the policy implications of these findings, particularly as they pertain to biobank security and access policies. We situate our review in the context of real data-sharing scenarios and biorepositories.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Biological Specimen Banks* / standards
  • Confidentiality*
  • Genetic Privacy
  • Guidelines as Topic
  • Information Dissemination
  • Public Opinion
  • Risk
  • Risk Management
  • Weights and Measures