Adjusting for background mutation frequency biases improves the identification of cancer driver genes

IEEE Trans Nanobioscience. 2013 Sep;12(3):150-7. doi: 10.1109/TNB.2013.2263391. Epub 2013 May 16.

Abstract

A common goal of tumor sequencing projects is to find genes whose mutations are selected for during tumor development. This is accomplished by choosing genes that have more non-synonymous mutations than expected from an estimated background mutation frequency. While this background frequency is unknown, it can be estimated from the observed synonymous mutation frequency and the non-synonymous to synonymous mutation ratio. The synonymous mutation frequency can be determined across all genes or in a gene-specific manner, and this choice introduces a trade-off. A gene-specific frequency adjusts for an underlying mutation bias, but is difficult to estimate when synonymous mutation counts are missing. A genome-wide synonymous frequency is more robust, but is less suited to adjusting for biases. Using four evaluation criteria for identifying genes with high non-synonymous mutation burden (reflecting preferential selection of expressed genes, genes with mutations in conserved bases, genes with many protein interactions, and genes that show loss of heterozygosity), we find that the gene-specific synonymous frequency is superior in the gene expression and protein interaction tests. In conclusion, the gene-specific synonymous mutation frequency is well suited for assessing a gene's non-synonymous mutation burden.
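
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which statistical test was used, so a one-sided Poisson tail test is assumed here, and normalizing mutation counts by gene length is likewise an assumption. All names (`poisson_sf`, `expected_nonsyn`, `burden_pvalues`, the `genes` records, and the `ns_to_s_ratio` value) are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam) (assumed test, not from the paper)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def expected_nonsyn(syn_freq, ns_to_s_ratio, gene_length):
    """Expected non-synonymous count under the background model.

    Background non-synonymous frequency is taken as the synonymous
    frequency (per base) times the non-synonymous to synonymous ratio.
    """
    return syn_freq * ns_to_s_ratio * gene_length


def burden_pvalues(genes, ns_to_s_ratio):
    """Score each gene with a genome-wide and a gene-specific background.

    `genes` is a list of dicts with keys: name, length, syn, nonsyn.
    The gene-specific p-value is None when a gene has no synonymous
    mutations, because its own frequency cannot be estimated (a choice
    made for this sketch).
    """
    total_syn = sum(g["syn"] for g in genes)
    total_len = sum(g["length"] for g in genes)
    genome_freq = total_syn / total_len  # pooled across all genes

    results = {}
    for g in genes:
        lam_genome = expected_nonsyn(genome_freq, ns_to_s_ratio, g["length"])
        p_genome = poisson_sf(g["nonsyn"], lam_genome)

        if g["syn"] > 0:
            gene_freq = g["syn"] / g["length"]
            lam_gene = expected_nonsyn(gene_freq, ns_to_s_ratio, g["length"])
            p_gene = poisson_sf(g["nonsyn"], lam_gene)
        else:
            p_gene = None  # missing synonymous counts: not estimable

        results[g["name"]] = {"genome_wide": p_genome, "gene_specific": p_gene}
    return results


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_genes = [
        {"name": "A", "length": 1000, "syn": 2, "nonsyn": 10},
        {"name": "B", "length": 1000, "syn": 0, "nonsyn": 3},
        {"name": "C", "length": 2000, "syn": 4, "nonsyn": 8},
    ]
    for name, pvals in burden_pvalues(toy_genes, ns_to_s_ratio=2.0).items():
        print(name, pvals)
```

In this sketch, a gene with a locally elevated synonymous frequency gets a larger expected count under the gene-specific background, and therefore a less extreme p-value, than under the genome-wide one. A gene with no synonymous mutations can only be scored against the genome-wide background.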

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Gene Expression Profiling
  • Genes, Neoplasm*
  • Humans
  • Loss of Heterozygosity
  • Melanoma / genetics*
  • Melanoma / metabolism
  • Models, Genetic*
  • Mutation / genetics
  • Mutation / radiation effects
  • Mutation Rate*
  • Tumor Cells, Cultured
  • Ultraviolet Rays