Evolutionary coupling analysis identifies the impact of disease-associated variants at less-conserved sites

Nucleic Acids Res. 2019 Sep 19;47(16):e94. doi: 10.1093/nar/gkz536.


Genome-wide association studies have discovered a large number of genetic variants in patients with disease. Predicting the impact of these variants is therefore important for separating disease-associated variants (DVs) from neutral variants. Current methods for predicting mutational impact depend on evolutionary conservation at the mutation site, which is determined from homologous sequences, and on the assumption that variants at well-conserved sites have high impact. However, many DVs at less-conserved but functionally important sites cannot be predicted by these methods. Here, we present a method to find DVs at less-conserved sites by predicting mutational impact using evolutionary coupling analysis. Functionally important and evolutionarily coupled sites often have compensatory variants at cooperating sites that avoid loss of function. We found that our method identified known intolerant variants in a diverse group of proteins. Furthermore, at less-conserved sites, we identified DVs that were not identified using conservation-based methods. These newly identified DVs were frequently found at protein interaction interfaces, where species-specific mutations often alter interaction specificity. This work presents a means to identify less-conserved DVs and provides insight into the relationship between evolutionarily coupled sites and human DVs.
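The contrast between conservation and coupling can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not specify; it assumes plain mutual information between alignment columns as a simple stand-in for evolutionary coupling, and an entropy-based score as a stand-in for conservation. The alignment `toy_msa` and all function names are hypothetical.

```python
import math
from collections import Counter

def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]

def entropy(symbols):
    """Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def conservation(msa, i, alphabet_size=20):
    """Entropy-based conservation: 1.0 for an invariant column, lower otherwise."""
    return 1.0 - entropy(column(msa, i)) / math.log2(alphabet_size)

def mutual_information(msa, i, j):
    """Mutual information between columns i and j (assumed coupling proxy)."""
    a, b = column(msa, i), column(msa, j)
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

# Hypothetical toy alignment: positions 1 and 3 vary together (K with E, R with D),
# position 0 is invariant, position 2 varies independently of position 1.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "AKIE",
    "ARID",
]
```

In this toy alignment, positions 1 and 3 are only moderately conserved yet share high mutual information, which is the kind of site pair a purely conservation-based score would underweight. Practical coupling analyses usually also correct for phylogenetic bias and indirect correlations, which this sketch omits.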

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Biological Evolution
  • Cardiovascular Diseases / diagnosis
  • Cardiovascular Diseases / genetics*
  • Conserved Sequence
  • Databases, Protein
  • Endocrine System Diseases / diagnosis
  • Endocrine System Diseases / genetics*
  • Eye Diseases / diagnosis
  • Eye Diseases / genetics*
  • Genetic Predisposition to Disease
  • Genome, Human
  • Genome-Wide Association Study
  • Hematologic Diseases / diagnosis
  • Hematologic Diseases / genetics*
  • Humans
  • Metabolic Diseases / diagnosis
  • Metabolic Diseases / genetics*
  • Mutation
  • Neoplasms / diagnosis
  • Neoplasms / genetics*
  • Nervous System Diseases / diagnosis
  • Nervous System Diseases / genetics*
  • Principal Component Analysis
  • Protein Binding
  • Protein Interaction Domains and Motifs
  • Sequence Alignment
  • Sequence Homology, Amino Acid