WITER: a powerful method for estimation of cancer-driver genes using a weighted iterative regression modelling background mutation counts

Nucleic Acids Res. 2019 Sep 19;47(16):e96. doi: 10.1093/nar/gkz566.


Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Because the distribution of background mutation counts is difficult to model, existing statistical methods are often underpowered to discriminate cancer-driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER, http://grass.cgs.hku.hk/limx/witer or KGGSeq, http://grass.cgs.hku.hk/limx/kggseq/), to detect cancer-driver genes showing an excess of somatic mutations. By fitting the distribution of background mutation counts properly, this approach works well even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in most tested cancers. Applying this approach, we estimated 229 driver genes in 26 different types of cancers. In silico validation confirmed 78% of predicted genes as likely known drivers and many other genes as very likely new drivers for corresponding cancers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects and the rescue of more genes missed by alternative tools in moderate or small samples.
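To make the distribution named in the abstract concrete, the following is a minimal Python sketch of a zero-truncated negative-binomial model and of an upper-tail probability for an observed mutation count. It is not the WITER implementation: it omits the weighting, the iteration and the regression on covariates, and the function names, the mean/dispersion parameterisation (`mu`, `alpha`) and the example values are assumptions chosen only for illustration.

```python
import math


def nb_log_pmf(y, mu, alpha):
    """Log pmf of a negative binomial with mean mu and variance mu + alpha * mu**2.

    The (mu, alpha) parameterisation is an assumption of this sketch.
    """
    r = 1.0 / alpha          # size parameter
    p = r / (r + mu)         # success probability, so that the mean equals mu
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(p) + y * math.log1p(-p))


def ztnb_log_pmf(y, mu, alpha):
    """Log pmf of the zero-truncated negative binomial: P(Y = y | Y >= 1)."""
    if y < 1:
        raise ValueError("zero-truncated counts must be >= 1")
    log_p0 = nb_log_pmf(0, mu, alpha)
    # Renormalise by the probability mass left after removing the zero class.
    return nb_log_pmf(y, mu, alpha) - math.log1p(-math.exp(log_p0))


def ztnb_upper_tail(y_obs, mu, alpha):
    """P(Y >= y_obs | Y >= 1): chance of a count at least this large under the background."""
    below = sum(math.exp(ztnb_log_pmf(k, mu, alpha)) for k in range(1, y_obs))
    return max(0.0, 1.0 - below)


if __name__ == "__main__":
    # Hypothetical background mean and dispersion for one gene (illustrative values only).
    mu, alpha = 2.0, 0.5
    for count in (1, 5, 15):
        print(count, ztnb_upper_tail(count, mu, alpha))
```

In a regression setting the background mean would be predicted per gene rather than fixed as it is here; a small upper-tail probability then flags a gene whose mutation count exceeds what the fitted background would suggest.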

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking
  • Computer Simulation
  • Gene Expression Regulation, Neoplastic*
  • Genomics / methods
  • Genomics / statistics & numerical data*
  • Humans
  • Internet
  • Mutation
  • Neoplasm Proteins / classification
  • Neoplasm Proteins / genetics*
  • Neoplasm Proteins / metabolism
  • Neoplasms / classification
  • Neoplasms / diagnosis*
  • Neoplasms / genetics
  • Oncogenes*
  • Regression Analysis
  • Sample Size
  • Software*


Substances

  • Neoplasm Proteins