MicroRNA based Pan-Cancer Diagnosis and Treatment Recommendation

BMC Bioinformatics. 2017 Jan 13;18(1):32. doi: 10.1186/s12859-016-1421-y.

Abstract

Background: The current state of the art in cancer diagnosis and treatment is not ideal; diagnostic tests are accurate but invasive, and treatments are "one-size-fits-all" instead of being personalized. Recently, miRNAs have garnered significant attention as cancer biomarkers, owing to their ease of access (circulating miRNA in the blood) and stability. Many studies have shown the effectiveness of miRNA data in diagnosing specific cancer types, but few explore the role of miRNA in predicting treatment outcome.

Methods: Here we go a step further, using tissue miRNA and clinical data across 21 cancers from The Cancer Genome Atlas (TCGA) database. We use machine learning techniques to create an accurate pan-cancer diagnosis system and a prediction model for treatment outcomes. Finally, using these models, we create a web-based tool that diagnoses cancer and recommends the best treatment options.

Results: We achieved 97.2% classification accuracy using a support vector machine classifier with a radial basis function kernel. The accuracies improved to 99.9-100% when climbing up the embryonic tree and classifying cancers at different stages. We define accuracy as the ratio of correctly classified instances to the total number of instances. The classifier also performed well on independent validation datasets, achieving greater than 80% sensitivity for many cancer types. Many miRNAs selected by our feature selection algorithm had strong previously reported associations with various cancers and tumor progression. Then, by encoding miRNA, clinical and treatment data in a machine-readable format, we built a prognosis predictor model that predicts the outcome of treatment with 85% accuracy. We used this model to create a tool that recommends personalized treatment regimens. Both the diagnosis and prognosis models, which incorporate semi-supervised learning techniques to improve their accuracy with repeated use, were uploaded online for easy access.
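
The two metrics named in the Results, accuracy (correctly classified instances over total instances) and per-cancer-type sensitivity, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy labels below are hypothetical, chosen only to show how the definitions work on a multi-class problem.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Ratio of correctly classified instances to total instances."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_sensitivity(y_true, y_pred):
    """For each true class: fraction of its instances predicted as that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up labels for illustration only (not data from the study).
y_true = ["BRCA", "BRCA", "LUAD", "LUAD", "LUAD", "KIRC"]
y_pred = ["BRCA", "LUAD", "LUAD", "LUAD", "LUAD", "KIRC"]

overall = accuracy(y_true, y_pred)
by_class = per_class_sensitivity(y_true, y_pred)
print(f"accuracy = {overall:.3f}")
for label, value in by_class.items():
    print(f"sensitivity[{label}] = {value:.2f}")
```

In this toy case one of the two BRCA samples is misclassified, so overall accuracy stays high while BRCA sensitivity drops to one half, which is why per-type sensitivity is worth reporting alongside a single accuracy figure.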

Conclusion: Our research is a step towards the final goal of diagnosing cancer and predicting treatment recommendations using non-invasive blood tests.

Keywords: Cancer diagnosis; Pan-cancer; TCGA dataset; miRNA.

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / analysis
  • Humans
  • MicroRNAs / genetics*
  • Neoplasms / diagnosis*
  • Neoplasms / therapy*
  • Prognosis
  • Support Vector Machine
  • Treatment Outcome

Substances

  • Biomarkers, Tumor
  • MicroRNAs