Recognition of early and late stages of bladder cancer using metabolites and machine learning

Metabolomics. 2019 Jun 20;15(7):94. doi: 10.1007/s11306-019-1555-9.

Abstract

Introduction: Bladder cancer (BCa) is one of the most common and aggressive cancers. It is the sixth most frequently occurring cancer in men, and its incidence increases with age. Current BCa diagnosis relies on cystoscopy and biopsy, a process that is expensive, unpleasant, and may have severe side effects. Recent growth in the power and accessibility of machine-learning software has allowed for the development of new, non-invasive diagnostic methods that do not compromise accuracy or sensitivity.

Objectives: The goal of this research was to identify biomarkers, including metabolites and their corresponding genes, for different stages of BCa; to show their distinguishing and common features; and to create a machine-learning model for classifying the stages of BCa.

Methods: Sets of metabolites specific to early and late stages, as well as those common to both stages, were analyzed using MetaboAnalyst and Ingenuity® Pathway Analysis (IPA®) software. Machine-learning methods were used to develop a binary classifier for early- and late-stage metabolites of BCa. Metabolites were quantitatively characterized using EDragon 1.0 software. The two modeling methods used were Multilayer Perceptron (MLP) and Stochastic Gradient Descent (SGD) with a logistic regression loss function.
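As an illustration of the second modeling method, the following is a minimal sketch of a binary classifier trained by stochastic gradient descent on a logistic regression loss. It is not the authors' implementation: the two-feature toy data, the label convention (0 for early stage, 1 for late stage), the learning rate, the L2 penalty, and all function names are assumptions made for this example. In the study, the inputs would be chemical descriptors computed for each metabolite.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sgd_logistic(X, y, lr=0.1, epochs=200, l2=1e-4, seed=0):
    """Fit weights and bias by per-sample gradient steps on the log loss.

    lr, epochs, and l2 are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a new random order each epoch
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of the log loss w.r.t. the linear score
            w = [wj - lr * (err * xj + l2 * wj) for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical standardized descriptors for six metabolites (toy data only).
X_toy = [[-1.2, -0.8], [-0.9, -1.1], [-1.0, -0.5],
         [1.1, 0.9], [0.8, 1.2], [1.3, 0.7]]
y_toy = [0, 0, 0, 1, 1, 1]  # assumed convention: 0 = early, 1 = late

w, b = train_sgd_logistic(X_toy, y_toy)
preds = [predict(w, b, x) for x in X_toy]
```

On real descriptor data, features would typically be standardized before training, since SGD is sensitive to feature scale.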

Results: We explored metabolic pathways related to early-stage BCa (Galactose metabolism and Starch and sucrose metabolism) and to late-stage BCa (Glycine, serine, and threonine metabolism, Arginine and proline metabolism, Glycerophospholipid metabolism, and Galactose metabolism), as well as pathways common to both stages. The central metabolite impacting the most cancerogenic genes (AKT, EGFR, MAPK3) in early-stage BCa is D-glucose, while late-stage BCa is characterized by significant fold changes in several metabolites: glycerol, choline, 13(S)-hydroxyoctadecadienoic acid, and 2'-fucosyllactose. Insulin was also seen to play an important role in late stages of BCa. The best-performing model predicted metabolite class with an accuracy of 82.54% and an area under the precision-recall curve (PRC) of 0.84 on the training set. The same model was applied to three separate sets of metabolites obtained from public sources: one set of late-stage metabolites and two sets of early-stage metabolites. The model was better at predicting early-stage metabolites, with accuracies of 72% (18/25) and 95% (19/20) on the early-stage sets and an accuracy of 65.45% (36/55) on the late-stage set.
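The accuracies reported for the three public metabolite sets follow directly from the counts given in parentheses, and can be checked with a short sketch. The set labels and the function name below are assumptions made for illustration. The pooled figure is derived arithmetic over the three sets combined; it is not a value reported in the abstract.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage of correctly classified metabolites."""
    return 100.0 * correct / total


# (correct, total) counts taken from the Results; the keys are illustrative labels.
external_sets = {
    "early set 1": (18, 25),
    "early set 2": (19, 20),
    "late set": (36, 55),
}

per_set = {
    name: round(accuracy_percent(c, n), 2)
    for name, (c, n) in external_sets.items()
}

# Derived here for illustration only: accuracy over all three sets combined.
pooled_correct = sum(c for c, _ in external_sets.values())
pooled_total = sum(n for _, n in external_sets.values())
pooled = round(accuracy_percent(pooled_correct, pooled_total), 2)

for name, acc in per_set.items():
    print(f"{name}: {acc}%")
print(f"pooled: {pooled}%")
```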

Conclusion: Comparing the biomarkers present in urine samples of BCa patients with those of normal patients makes it possible to pinpoint the biomarkers associated with this cancer and to elucidate the affected metabolic pathways that are specific to different stages of the cancer. Developing a machine-learning model that incorporates metabolites and their chemical descriptors made it possible to predict the stages of BCa with considerable accuracy.

Keywords: Biomarkers; Bladder cancer; Machine learning; Metabolic networks; Metabolomics.

MeSH terms

  • Amino Acids / metabolism
  • Area Under Curve
  • Biomarkers, Tumor / urine
  • ErbB Receptors / metabolism
  • Galactose / metabolism
  • Glycine / metabolism
  • Humans
  • Insulin / metabolism
  • Machine Learning*
  • Metabolic Networks and Pathways / genetics
  • Neoplasm Staging
  • Proto-Oncogene Proteins c-akt / metabolism
  • ROC Curve
  • Software
  • Urinary Bladder Neoplasms / metabolism
  • Urinary Bladder Neoplasms / pathology*

Substances

  • Amino Acids
  • Biomarkers, Tumor
  • Insulin
  • EGFR protein, human
  • ErbB Receptors
  • Proto-Oncogene Proteins c-akt
  • Glycine
  • Galactose