Purpose: Several in silico tools have been shown to have reasonable research sensitivity and specificity for classifying sequence variants in coding regions. The recently developed combined annotation-dependent depletion (CADD) method generates predictive scores for single-nucleotide variants (SNVs) in all areas of the genome, including noncoding regions. We sought to determine the clinical validity of CADD scores for noncoding variants.
Methods: We evaluated 12,391 unique SNVs in 624 patient samples submitted for germ-line mutation testing in a cancer-related gene panel. Stratifying by genomic region, we compared the distributions of CADD scores of rare SNVs, SNVs common in our patient population, and the null distribution of all possible SNVs.
Results: The median CADD scores of intronic and nonsynonymous variants were significantly different between rare and common SNVs (P < 0.0001). Despite these different distributions, no individual variants could be identified as plausibly causative among the rare intronic variants with the highest scores. The receiver-operating characteristic (ROC) area under the curve (AUC) for noncoding variants was modest, and the positive predictive value of CADD for intronic variants in panel testing was 0.088.
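For readers less familiar with the two summary metrics reported above, the following is a minimal sketch of how a positive predictive value and an ROC AUC can be computed from scored variants. It is not the authors' analysis pipeline: the function names, the counts, and the toy scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV = TP / (TP + FP): the fraction of flagged variants that are truly causative."""
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no variants were flagged, so PPV is undefined")
    return true_pos / flagged


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties contribute 0.5, which matches the Mann-Whitney U statistic divided by
    the number of positive-negative pairs.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy values for illustration only (not study data).
toy_rare_scores = [12.0, 8.5, 15.2, 3.1]
toy_common_scores = [2.0, 8.5, 1.4, 5.0, 0.7]

toy_auc = roc_auc(toy_rare_scores, toy_common_scores)
toy_ppv = positive_predictive_value(true_pos=2, false_pos=18)
print(f"toy AUC = {toy_auc:.3f}, toy PPV = {toy_ppv:.3f}")
```

A low PPV means that most variants flagged by a high score are not causative, even when the score distributions of the two groups differ significantly.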
Conclusion: Focused in silico scoring systems with much higher predictive value will be necessary for clinical genomic applications.
Genet Med 18(12):1269-1275.