Assessing the performance of in silico methods for predicting the pathogenicity of variants in the gene CHEK2, among Hispanic females with breast cancer

Hum Mutat. 2019 Sep;40(9):1612-1622. doi: 10.1002/humu.23849. Epub 2019 Aug 17.


The availability of disease-specific genomic data is critical for developing new computational methods that predict the pathogenicity of human variants and advance the field of precision medicine. However, the lack of gold standards to properly train and benchmark such methods is one of the greatest challenges in the field. In response to this challenge, the scientific community is invited to participate in the Critical Assessment of Genome Interpretation (CAGI), in which unpublished disease variants are made available for classification by in silico methods. As part of the CAGI-5 challenge, we evaluated the performance of 18 submissions and three additional methods in predicting the pathogenicity of single-nucleotide variants (SNVs) in checkpoint kinase 2 (CHEK2) for cases of breast cancer in Hispanic females. The assessment also considered the efficacy of the analysis method and the setup of the challenge. The results indicated that, although the challenge could benefit from additional participant data, the combined generalized linear model analysis and odds of pathogenicity analysis provided a framework for evaluating the methods submitted for SNV pathogenicity identification and for comparing them with other available methods. The outcome of this challenge and the approaches used can help guide further advancements in identifying SNV-disease relationships.

Keywords: CAGI; CHEK2; Hispanic women; SNV; breast cancer.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Breast Neoplasms / ethnology
  • Breast Neoplasms / genetics*
  • Case-Control Studies
  • Checkpoint Kinase 2 / genetics*
  • Computational Biology / methods*
  • Computer Simulation
  • Exome Sequencing
  • Female
  • Genetic Predisposition to Disease
  • Hispanic or Latino / genetics*
  • Humans
  • Linear Models
  • Middle Aged
  • Polymorphism, Single Nucleotide*
  • United States / ethnology


Substances

  • Checkpoint Kinase 2
  • CHEK2 protein, human