Diagnostic yield of targeted next generation sequencing in various cancer types: an information-theoretic approach

Cancer Genet. 2015 Sep;208(9):441-7. doi: 10.1016/j.cancergen.2015.05.030. Epub 2015 May 29.


The information-theoretic concept of Shannon entropy can be used to quantify the information provided by a diagnostic test. We hypothesized that in tumor types with stereotyped mutational profiles, the results of next generation sequencing (NGS) testing would yield lower average information than in tumors with more diverse mutations. To test this hypothesis, we estimated the entropy of NGS testing in various cancer types, using results obtained from clinical sequencing. A set of 238 tumors was subjected to clinical targeted NGS across all exons of 27 genes. There were 120 actionable variants in 109 cases, occurring in the genes KRAS, EGFR, PTEN, PIK3CA, KIT, BRAF, NRAS, IDH1, and JAK2. Sequencing results for each tumor were modeled as a dichotomized genotype (actionable mutation detected or not detected) for each of the 27 genes. Based upon the entropy of these genotypes, sequencing was most informative for colorectal cancer (3.235 bits of information/case), followed by high grade glioma (2.938 bits), lung cancer (2.197 bits), pancreatic cancer (1.339 bits), and sarcoma/STTs (1.289 bits). In the most informative cancer types, the information content of NGS was similar to that of surgical pathology examination (modeled at approximately 2-3 bits). Entropy provides a novel measure of utility for laboratory testing in general and for NGS in particular. This metric is, however, purely analytical and does not capture the relative clinical significance of the identified variants, which may also differ across tumor types.
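As an illustration of the entropy calculation described above, the following is a minimal sketch, not the authors' implementation. It assumes that each tumor's dichotomized genotype is treated as a single outcome and that entropy is estimated from the empirical frequencies of those outcomes within one cancer type; the abstract does not state whether the entropy was computed jointly across genes or gene by gene. The function name `shannon_entropy` and the three-gene toy cohort are hypothetical and are not data from the study.

```python
from collections import Counter
from math import log2


def shannon_entropy(observations):
    """Empirical Shannon entropy, in bits, of a sequence of hashable outcomes."""
    counts = Counter(observations)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the outcomes that were actually observed.
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical toy cohort (NOT study data): one tuple per tumor, giving the
# dichotomized genotype (1 = actionable mutation detected, 0 = not detected)
# for three genes in the assumed order (KRAS, BRAF, PIK3CA).
toy_genotypes = [
    (1, 0, 0),
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 0),
]

# Outcome frequencies are 1/2, 1/4, 1/4, so the entropy is 1.5 bits per case.
entropy_bits = shannon_entropy(toy_genotypes)
print(f"{entropy_bits:.3f} bits/case")
```

Under this assumption, a cohort in which every tumor shows the same genotype yields 0 bits, which matches the hypothesis that stereotyped mutational profiles make the test result less informative on average.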

Keywords: comparative effectiveness research; high-throughput nucleotide sequencing; information theory; molecular diagnostic techniques; neoplasms.

MeSH terms

  • Algorithms*
  • DNA Mutational Analysis / methods
  • Entropy
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Genetic
  • Neoplasms / diagnosis*
  • Neoplasms / genetics
  • Retrospective Studies
  • Sequence Analysis, DNA / methods*