Predicting pathways for old and new metabolites through clustering

J Theor Biol. 2024 Feb 7:578:111684. doi: 10.1016/j.jtbi.2023.111684. Epub 2023 Dec 3.

Abstract

Diverse metabolic pathways are fundamental to all living organisms: they harvest energy, synthesize biomass components, produce molecules to interact with the microenvironment, and neutralize toxins. While the discovery of new metabolites and pathways continues, predicting pathways for new metabolites remains challenging. Elucidating these pathways can take a vast amount of time; consequently, according to the Human Metabolome Database (HMDB), only 60% of metabolites are assigned to pathways. Here, we present an approach to identify pathways based on metabolite structure. We extracted 201 features from SMILES annotations and identified new metabolites from PubMed abstracts and HMDB. After applying clustering algorithms to both groups of features, we quantified correlations between metabolites and found that the clusters accurately linked 92% of known metabolites to their respective pathways. Thus, this approach could be valuable for predicting metabolic pathways for new metabolites.
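
The general idea of clustering metabolites by structural features and then reading a pathway off the cluster can be illustrated with a minimal sketch. It is not the authors' implementation: the six crude SMILES-derived flags stand in for the 201 features (which the abstract does not list), the hand-written k-modes routine is a simplified version of the K-mode clustering named in the keywords, and the example metabolites, pathway labels, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter


def smiles_features(smiles):
    """Crude categorical flags from a SMILES string.

    ASSUMPTION: these six substring checks are toy stand-ins, not the
    201 features used in the paper, and they ignore SMILES grammar.
    """
    return (
        "N" in smiles,                      # nitrogen present
        "S" in smiles,                      # sulfur present
        "P" in smiles,                      # phosphorus present
        any(ch.isdigit() for ch in smiles), # ring-closure digit present
        "=" in smiles,                      # double bond present
        "c" in smiles,                      # aromatic carbon present
    )


def hamming(a, b):
    """Number of positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_mode(record, modes):
    """Index of the mode closest to the record (ties go to the lowest index)."""
    return min(range(len(modes)), key=lambda j: hamming(record, modes[j]))


def k_modes(records, k, n_iter=10):
    """Simplified k-modes: Hamming distance, per-feature majority as the mode.

    Initial modes are the first k distinct records, which keeps the
    sketch deterministic (real implementations use smarter seeding).
    """
    modes = []
    for r in records:
        if r not in modes:
            modes.append(r)
        if len(modes) == k:
            break
    labels = []
    for _ in range(n_iter):
        labels = [nearest_mode(r, modes) for r in records]
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep an empty cluster's mode unchanged
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


# Hypothetical "known" metabolites with illustrative pathway labels.
known = [
    ("alanine", "CC(N)C(=O)O", "amino acid metabolism"),
    ("glycine", "NCC(=O)O", "amino acid metabolism"),
    ("serine", "NC(CO)C(=O)O", "amino acid metabolism"),
    ("glucose", "OCC1OC(O)C(O)C(O)C1O", "carbohydrate metabolism"),
    ("fructose", "OCC1(O)OC(CO)C(O)C1O", "carbohydrate metabolism"),
    ("ribose", "OCC1OC(O)C(O)C1O", "carbohydrate metabolism"),
]

records = [smiles_features(smi) for _, smi, _ in known]
labels, modes = k_modes(records, k=2)


def predict_pathway(smiles):
    """Assign a metabolite to its nearest cluster and return that
    cluster's most common pathway among the known members."""
    cluster = nearest_mode(smiles_features(smiles), modes)
    pathways = [p for (_, _, p), lab in zip(known, labels) if lab == cluster]
    return Counter(pathways).most_common(1)[0][0]


# Valine is treated here as a "new" metabolite with no pathway label.
predicted = predict_pathway("CC(C)C(N)C(=O)O")
print(predicted)
```

In practice, a cheminformatics toolkit would replace the substring checks with proper descriptors, and a mixed categorical and numeric feature set would call for a K-prototype style distance rather than pure Hamming distance.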

Keywords: AdaBoostClassifier; K-mode clustering; K-prototype clustering; Metabolites prediction; Pathways prediction.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Databases, Factual
  • Humans
  • Metabolic Networks and Pathways*
  • Metabolome*
  • Metabolomics