Biological relevance of computationally predicted pathogenicity of noncoding variants

Nat Commun. 2019 Jan 18;10(1):330. doi: 10.1038/s41467-018-08270-y.


Computational prediction of the phenotypic propensities of noncoding single nucleotide variants typically combines annotation of genomic, functional and evolutionary attributes into a single score. Here, we evaluate whether the claimed excellent accuracies of these predictions translate into high rates of success in addressing questions important in biological research, such as fine-mapping causal variants, distinguishing pathogenic allele(s) at a given position, and prioritizing variants for genetic risk assessment. We find a significant disconnect between the statistical modelling and biological performance of predictive approaches. We discuss fundamental reasons underlying these deficiencies and suggest that future improvements of computational predictions need to address the confounding of allelic, positional and regional effects as well as the imbalance in the proportion of true positive variants in candidate lists.
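
The point about imbalance in candidate lists can be illustrated with a small calculation. The sketch below is not the authors' method. It applies Bayes' rule to show how the fraction of truly pathogenic variants among those a predictor flags (the positive predictive value) falls as true positives become rarer in the candidate list. The function name and the operating point (sensitivity and specificity of 0.9) are hypothetical and chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of flagged variants that are truly pathogenic (Bayes' rule).

    prevalence is the proportion of true positives in the candidate list.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point; these numbers are not taken from the paper.
for prevalence in (0.5, 0.01, 0.001):
    ppv = positive_predictive_value(0.9, 0.9, prevalence)
    print(f"prevalence={prevalence:<6} PPV={ppv:.3f}")
```

Under these assumed values, a predictor that looks accurate on a balanced benchmark (half of the candidates pathogenic) flags mostly false positives once only about one candidate in a hundred is truly causal.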

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology
  • Disease / genetics*
  • Evolution, Molecular
  • Genome-Wide Association Study
  • Humans
  • Machine Learning
  • Mammals / genetics
  • Models, Statistical*
  • Polymorphism, Single Nucleotide
  • RNA, Untranslated / genetics*

Substances

  • RNA, Untranslated