Prioritizing predictive biomarkers for gene essentiality in cancer cells with mRNA expression data and DNA copy number profile

Bioinformatics. 2018 Dec 1;34(23):3975-3982. doi: 10.1093/bioinformatics/bty467.

Abstract

Motivation: Finding driver genes that are responsible for the aberrant proliferation rate of cancer cells is informative for both cancer research and the development of targeted drugs. The established experimental and computational methods are labor-intensive. To make algorithms feasible in real clinical settings, methods that can predict driver genes using less experimental data are urgently needed.

Results: We designed an effective feature selection method and used Support Vector Machines (SVM) to predict the essentiality of the potential driver genes in cancer cell lines with only 10 genes as features. The accuracy of our predictions was the highest in the Broad-DREAM Gene Essentiality Prediction Challenge. We also found a set of genes whose essentiality could be predicted much more accurately than others, which we called Accurately Predicted (AP) genes. Our method can serve as a new way of assessing the essentiality of genes in cancer cells.
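
The abstract does not say how the 10 features were chosen, so the following is only a minimal sketch of one common approach. It ranks candidate features (mRNA expression or copy number values per cell line) by absolute Pearson correlation with a gene's essentiality score and keeps the top 10. The correlation criterion, the function names `pearson` and `select_top_features`, and the synthetic data are all assumptions for illustration, not the authors' published procedure. The selected features would then be passed to an SVM, which is not shown here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_features(features, target, k=10):
    """Return the k feature names most correlated (in absolute value) with target.

    features: dict mapping a feature name to its values across cell lines.
    target:   essentiality scores of one gene across the same cell lines.
    NOTE: correlation ranking is an assumed criterion, not taken from the paper.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Synthetic example: 40 hypothetical cell lines, 50 noise features,
# plus two features constructed to track the essentiality score.
rng = random.Random(0)
n_cell_lines = 40
target = [rng.gauss(0, 1) for _ in range(n_cell_lines)]
features = {
    f"noise_{i}": [rng.gauss(0, 1) for _ in range(n_cell_lines)]
    for i in range(50)
}
features["informative_expr"] = [t + rng.gauss(0, 0.1) for t in target]
features["informative_cn"] = [-t + rng.gauss(0, 0.1) for t in target]

selected = select_top_features(features, target, k=10)
```

Ranking by absolute correlation keeps both positively and negatively associated features, which is why the anti-correlated `informative_cn` is retained alongside `informative_expr`.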

Availability and implementation: The raw data that support the findings of this study are available at Synapse: https://www.synapse.org/#!Synapse:syn2384331/wiki/62825. Source code is available at GitHub: https://github.com/GuanLab/DREAM-Gene-Essentiality-Challenge.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Computational Biology
  • DNA Copy Number Variations*
  • Genes, Neoplasm*
  • Humans
  • RNA, Messenger / genetics
  • Software*

Substances

  • Biomarkers, Tumor
  • RNA, Messenger