Assessing the accuracy of observer-reported ancestry in a biorepository linked to electronic medical records

Genet Med. 2010 Oct;12(10):648-50. doi: 10.1097/GIM.0b013e3181efe2df.


Purpose: The Vanderbilt DNA Databank (BioVU) is a biorepository that currently contains >80,000 DNA samples linked to electronic medical records. Although BioVU is a valuable source of samples and phenotypes for genetic association studies, it is unclear whether the administratively assigned race/ethnicity in BioVU can accurately describe and be used as a proxy for genetic ancestry.

Methods: We genotyped 360 single nucleotide polymorphisms on the Illumina DNA Test Panel containing ancestry informative markers in 1910 BioVU samples with observer-reported ancestry and 384 samples from the Multiple Sclerosis Genetics Group with self-reported ancestry. Genetic ancestry was inferred for all individuals using Structure 2.2.

Results: More than 98% of observer-reported European Americans were genetically inferred to have at least 60% European ancestry. Ninety-three percent of observer-reported African Americans were genetically inferred to be predominantly of African ancestry. We determined that the concordance of observer-reported race/ethnicity and inferred genetic ancestry was not significantly different from that of self-reported race/ethnicity in either population (P = 0.09 and 0.94 in European Americans and African Americans, respectively).
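The comparison reported in the Results can be illustrated with a minimal sketch. It assumes that an individual counts as concordant when the inferred proportion of the reported ancestry meets a cutoff (0.60 here, taken from the European American threshold above), and it uses a two-sided Fisher exact test to compare the observer-reported and self-reported groups. The abstract does not name the statistical test that was used, so the choice of Fisher's test, the function names, and the ancestry proportions below are all hypothetical and serve only as an illustration.

```python
from math import comb


def concordance_rate(ancestry_fractions, threshold=0.60):
    """Fraction of individuals whose inferred ancestry proportion meets the threshold."""
    if not ancestry_fractions:
        raise ValueError("need at least one individual")
    concordant = sum(1 for q in ancestry_fractions if q >= threshold)
    return concordant / len(ancestry_fractions)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the two reporting methods; columns are concordant / discordant counts.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x concordant individuals in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical inferred ancestry proportions (NOT data from the study).
observer_reported = [0.95, 0.88, 0.40, 0.99, 0.72]
self_reported = [0.97, 0.91, 0.85, 0.55, 0.93]

obs_rate = concordance_rate(observer_reported)
self_rate = concordance_rate(self_reported)

# Build the 2x2 table of concordant / discordant counts for each group.
obs_conc = round(obs_rate * len(observer_reported))
self_conc = round(self_rate * len(self_reported))
p_value = fisher_exact_two_sided(
    obs_conc, len(observer_reported) - obs_conc,
    self_conc, len(self_reported) - self_conc,
)
print(obs_rate, self_rate, p_value)
```

A large P value from such a test would indicate no detectable difference in concordance between the two reporting methods, which is the form of the conclusion drawn in the Results.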

Conclusions: Observer-reported race/ethnicity for European Americans and African Americans approximates genetic ancestry as well as self-reported race/ethnicity, making biorepositories linked to electronic medical records such as BioVU a viable source of DNA samples for future large-scale genetic association studies.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Specimen Banks*
  • Blacks
  • Databases, Genetic
  • Databases, Nucleic Acid
  • Electronic Health Records*
  • Ethnicity* / genetics
  • Genetic Association Studies
  • Genetic Markers
  • Genotype
  • Humans
  • Medical Records Systems, Computerized
  • Observer Variation*
  • Phenotype
  • Racial Groups* / genetics
  • Self Report
  • Whites

