Machine learning-guided evaluation of extraction and simulation methods for cancer patient-specific metabolic models

Comput Struct Biotechnol J. 2022 Jun 15;20:3041-3052. doi: 10.1016/j.csbj.2022.06.027. eCollection 2022.


The genome-scale metabolic model (GEM) has been established as an important tool for studying cellular metabolism at the systems level by predicting intracellular fluxes. Since the advent of generic human GEMs, these models have been increasingly applied to a range of diseases, often with the objective of predicting effective metabolic drug targets. Cancer is a representative disease for which the use of GEMs has proved effective, partly owing to the wide availability of patient-specific RNA-seq data. When using a human GEM, a so-called context-specific GEM must first be developed using cell-specific RNA-seq data. The biological validity of a context-specific GEM depends strongly on both the model extraction method (MEM) and the model simulation method (MSM). However, while MEMs have been thoroughly examined, MSMs have not been systematically examined, especially in studies of cancer metabolism. In this study, the effects of pairwise combinations of three MEMs and five MSMs were evaluated by examining biological features of the resulting cancer patient-specific GEMs. For this, a total of 1,562 patient-specific GEMs were reconstructed and subjected to machine learning-guided and biological evaluations to draw robust conclusions. Noteworthy observations from the evaluation include the high performance of two MEMs, namely rank-based 'task-driven Integrative Network Inference for Tissues' (tINIT) and 'Gene Inactivity Moderated by Metabolism and Expression' (GIMME), when paired with least absolute deviation (LAD) as an MSM, and the relatively poorer performance of flux balance analysis (FBA) and parsimonious FBA (pFBA). Insights from this study can serve as a reference when studying cancer metabolism using patient-specific GEMs.
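To make the distinction between two of the MSMs named above more concrete, the following is a minimal sketch of FBA and pFBA on a toy network, written with SciPy's linear programming solver. It is not the authors' pipeline: the four-reaction network, the bounds, and the function names `run_fba` and `run_pfba` are hypothetical and chosen only for illustration. All toy reactions are assumed irreversible, so the sum of fluxes equals the sum of absolute fluxes that pFBA minimizes.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the study).
# Columns: R1 (-> A), R2 (A -> B), R3 (B -> A), R4 (B -> export, the objective).
S = np.array([
    [1.0, -1.0,  1.0,  0.0],   # mass balance of metabolite A
    [0.0,  1.0, -1.0, -1.0],   # mass balance of metabolite B
])
BOUNDS = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]  # all reactions irreversible
OBJECTIVE = 3  # index of R4


def run_fba(S, bounds, objective):
    """FBA: maximize the objective flux subject to S v = 0 and flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective] = -1.0  # linprog minimizes, so negate to maximize
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


def run_pfba(S, bounds, objective, tol=1e-9):
    """pFBA: keep the FBA optimum, then minimize the total flux."""
    optimum = run_fba(S, bounds, objective)[objective]
    pinned = list(bounds)
    # Require the objective flux to stay at its optimum (within a small tolerance).
    pinned[objective] = (optimum - tol, bounds[objective][1])
    c = np.ones(S.shape[1])  # sum of fluxes; equals the L1 norm because v >= 0
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=pinned, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


fba_flux = run_fba(S, BOUNDS, OBJECTIVE)
pfba_flux = run_pfba(S, BOUNDS, OBJECTIVE)
print("FBA objective flux:", round(float(fba_flux[OBJECTIVE]), 6))
print("pFBA fluxes:", np.round(pfba_flux, 6))
```

In this toy case, R2 and R3 form a loop that FBA alone leaves unconstrained, so many flux distributions reach the same optimum; the second pFBA step selects the one with the smallest total flux, which removes flux through the loop.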

Keywords: 1D CNN, one-dimensional convolutional neural network; Cancer patient-specific metabolic model; E-Flux2, E-Flux method combined with minimization of L2 norm; Evaluation; FBA, flux balance analysis; GEM, genome-scale metabolic model; GIMME, Gene Inactivity Moderated by Metabolism and Expression; GPR, gene-protein-reaction; Genome-scale metabolic model; LAD, least absolute deviation; MEM, model extraction method; MSM, model simulation method; Machine learning; Model extraction method; Model simulation method; SPOT, Simplified Pearson cOrrelation with Transcriptomic data; pFBA, parsimonious flux balance analysis; t-SNE, t-distributed stochastic neighbor embedding; tINIT, task-driven Integrative Network Inference for Tissues.

Associated data

  • figshare/10.6084/m9.figshare.19810927.v1