Functional impact of missense variants in BRCA1 predicted by supervised learning

PLoS Comput Biol. 2007 Feb 16;3(2):e26. doi: 10.1371/journal.pcbi.0030026. Epub 2006 Dec 28.


Many individuals tested for inherited cancer susceptibility at the BRCA1 gene locus are found to carry variants of unknown clinical significance (UCVs). Most UCVs cause a single amino acid residue (missense) change in the BRCA1 protein. These variants can be assayed biochemically, but such evaluations are time-consuming and labor-intensive. Computational methods that classify UCVs and suggest explanations for their impact on protein function can complement functional tests. Here we describe a supervised learning approach to the classification of BRCA1 UCVs. Using a novel combination of 16 predictive features, the algorithms were applied to retrospectively classify the impact of 36 BRCA1 C-terminal (BRCT) domain UCVs that had been biochemically assayed for transactivation function, and to blindly classify 54 documented UCVs. The majority vote of three supervised learning algorithms agrees with the assay for more than 94% of the assayed UCVs. Two UCVs found deleterious by both the assay and the classifiers reveal a previously uncharacterized putative binding site. Clinicians may soon be able to use computational classifiers such as those described here to better inform patients. These classifiers can be adapted to other cancer susceptibility genes and applied systematically to prioritize the growing number of potential causative loci and variants found by large-scale disease association studies.
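
The majority-vote step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`majority_vote`, `agreement`), the label strings, and the four toy variants are all hypothetical and invented for illustration, and the abstract does not specify which three algorithms or which 16 features were used. The sketch only shows how three per-variant predictions might be combined into a consensus call and compared with assay results.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label for each variant across classifiers.

    predictions: one list of labels per classifier, all the same length.
    With three classifiers and two labels, a tie cannot occur.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def agreement(predicted, assayed):
    """Fraction of variants whose predicted label matches the assay label."""
    matches = sum(p == a for p, a in zip(predicted, assayed))
    return matches / len(assayed)


# Toy labels, invented for illustration only (NOT data from the study).
clf_a = ["deleterious", "neutral", "neutral", "deleterious"]
clf_b = ["deleterious", "neutral", "deleterious", "neutral"]
clf_c = ["neutral", "neutral", "deleterious", "deleterious"]
assay = ["deleterious", "neutral", "deleterious", "neutral"]

consensus = majority_vote([clf_a, clf_b, clf_c])
print(consensus)
print(agreement(consensus, assay))
```

In this toy example the consensus matches the assay for three of the four variants; the study's reported agreement of more than 94% refers to its own data, not to these made-up labels.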

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Artificial Intelligence*
  • BRCA1 Protein / chemistry*
  • BRCA1 Protein / physiology*
  • Genetic Variation / genetics
  • Molecular Sequence Data
  • Mutation, Missense
  • Pattern Recognition, Automated
  • Sequence Alignment / methods
  • Sequence Analysis, Protein / methods*
  • Structure-Activity Relationship


Substances

  • BRCA1 Protein