Predicting the functional impact of protein mutations: application to cancer genomics

Nucleic Acids Res. 2011 Sep 1;39(17):e118. doi: 10.1093/nar/gkr407. Epub 2011 Jul 3.


As large-scale re-sequencing of genomes reveals many protein mutations, especially in human cancer tissues, predicting their likely functional impact becomes an important practical goal. Here, we introduce a new functional impact score (FIS) for amino acid residue changes based on evolutionary conservation patterns. The information in these patterns is derived from aligned families and sub-families of sequence homologs within and between species using a combinatorial entropy formalism. On a large set of human protein mutations, the score performs well in separating disease-associated variants (∼19 200), assumed to be strongly functional, from common polymorphisms (∼35 600), assumed to be weakly functional (area under the receiver operating characteristic curve of ∼0.86). In cancer, using recurrence, multiplicity and annotation for ∼10 000 mutations in the COSMIC database, the method does well in assigning higher scores to more likely functional mutations ('drivers'). To guide experimental prioritization, we report a list of about 1000 top human cancer genes frequently mutated in one or more cancer types, ranked by likely functional impact, and an additional 1000 candidate cancer genes with rare but likely functional mutations. In addition, we estimate that at least 5% of cancer-relevant mutations involve a switch of function, rather than simply loss or gain of function.
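
The area under the receiver operating characteristic curve quoted in the abstract can be read as the probability that a randomly chosen disease-associated variant receives a higher score than a randomly chosen common polymorphism. The sketch below illustrates that interpretation only; it is not the authors' evaluation code. The function name `roc_auc` and the score values are hypothetical and were chosen purely for illustration.

```python
from typing import Sequence


def roc_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Return the ROC AUC as the fraction of (positive, negative) pairs in
    which the positive scores higher; tied pairs count as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical impact scores, NOT taken from the paper.
disease_scores = [3.1, 2.4, 1.9, 2.8, 0.7]        # assumed strongly functional
polymorphism_scores = [0.2, 1.1, 0.9, 2.0, -0.3]  # assumed weakly functional

auc = roc_auc(disease_scores, polymorphism_scores)
print(f"AUC = {auc:.2f}")
```

The pairwise form is quadratic in the number of variants; for sets of the size reported in the abstract, a rank-based computation would be the more practical choice, with the same result.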

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Databases, Genetic
  • Genes, Neoplasm*
  • Genes, p53
  • Genomics / methods
  • Humans
  • Mutation, Missense*
  • Neoplasm Proteins / genetics*
  • Neoplasm Proteins / physiology
  • Neoplasms / genetics*
  • Polymorphism, Genetic
  • Sequence Alignment
  • Sequence Analysis, Protein


Substances

  • Neoplasm Proteins