In silico analysis of missense substitutions using sequence-alignment based methods

Hum Mutat. 2008 Nov;29(11):1327-36. doi: 10.1002/humu.20892.


Genetic testing for mutations in high-risk cancer susceptibility genes often reveals missense substitutions that are not easily classified as pathogenic or neutral. Computational analyses are among the methods that can help classify them. A variant can be predicted to be pathogenic or neutral, or assigned a probability of pathogenicity, based on: 1) for almost any missense sequence variant, inferences from evolutionary conservation using protein multiple sequence alignments (PMSAs) of the gene of interest; and 2) for many variants, structural features of the wild-type and variant proteins. These in silico methods have improved considerably in recent years. In this work, we review and, where appropriate, make suggestions regarding: 1) the rationale for using in silico methods to help predict the consequences of missense variants; 2) important aspects of creating PMSAs that are informative for classification; 3) specific features of algorithms that have been used to classify clinically observed variants; 4) validation studies demonstrating that computational analyses can have predictive values (PVs) of approximately 75 to 95%; 5) current limitations of data sets and algorithms that need to be addressed to improve the computational classifiers; and 6) how in silico algorithms can be a part of the "integrated analysis" of multiple lines of evidence to help classify variants. We conclude that carefully validated computational algorithms, used in the context of other evidence, can be an important tool for classifying missense variants.
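As a rough illustration of two ideas mentioned in the abstract, the sketch below scores per-position conservation in a PMSA and computes predictive values from validation counts. It is not any of the algorithms reviewed in the paper: Shannon entropy is used here only as one simple, generic conservation measure, and the alignment, the function names, and the validation counts are all invented for the example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps.

    Lower entropy means the position is more conserved across species.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_invariant(column):
    """True if every non-gap sequence carries the same residue."""
    return len({r for r in column if r != "-"}) == 1


def predictive_values(tp, fp, tn, fn):
    """Return (positive PV, negative PV) from validation counts.

    tp/fp: variants predicted pathogenic that are truly pathogenic/neutral.
    tn/fn: variants predicted neutral that are truly neutral/pathogenic.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical toy alignment: one row per species, one column per residue.
# These sequences are made up and do not come from any real gene.
toy_alignment = [
    "MKCLA",
    "MKCIA",
    "MRCVA",
    "MKCLS",
]

# Transpose rows into columns so each position can be scored on its own.
columns = list(zip(*toy_alignment))
entropies = [column_entropy(col) for col in columns]

for position, (col, h) in enumerate(zip(columns, entropies), start=1):
    label = "invariant" if is_invariant(col) else "variable"
    print(f"position {position}: {''.join(col)}  entropy={h:.3f}  {label}")

# Hypothetical validation counts, chosen only to show the arithmetic.
ppv, npv = predictive_values(tp=45, fp=5, tn=40, fn=10)
print(f"PPV={ppv:.2f}  NPV={npv:.2f}")
```

In a sketch like this, a substitution at an invariant position (zero entropy) would be treated as more suspicious than one at a variable position. How informative that is depends heavily on the depth and quality of the alignment, which is one of the aspects the review addresses.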

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology
  • Evolution, Molecular
  • Genes, Neoplasm*
  • Genetic Predisposition to Disease
  • Genetic Testing / methods*
  • Genetic Variation
  • Humans
  • Molecular Sequence Data
  • Mutation, Missense*
  • Neoplastic Syndromes, Hereditary / classification*
  • Neoplastic Syndromes, Hereditary / genetics
  • Predictive Value of Tests
  • Sequence Alignment