Human Variome Project Quality Assessment Criteria for Variation Databases

Hum Mutat. 2016 Jun;37(6):549-58. doi: 10.1002/humu.22976. Epub 2016 Mar 21.

Abstract

Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or for groups of genes associated with a particular disease. These databases are widely considered the most reliable information source for a particular gene, protein, or disease, but their contents, infrastructure, and quality vary widely. Evaluating quality is important because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The resulting HVP quality evaluation criteria are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and on how the quality scheme can be implemented. Examples are provided of the current status of the quality items in two different databases: BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance.

Keywords: Human Variome Project; LSDB; components of quality; database quality; gene variant databases; genetic variation; locus-specific variation databases; quality scheme.

MeSH terms

  • Databases, Genetic / standards*
  • Genetic Variation*
  • Genome, Human
  • Human Genome Project
  • Humans
  • Quality Control