Accurate and Efficient Determination of Unknown Metabolites in Metabolomics by NMR-Based Molecular Motif Identification

Anal Chem. 2019 Dec 17;91(24):15686-15693. doi: 10.1021/acs.analchem.9b03849. Epub 2019 Dec 3.

Abstract

Knowledge of the chemical identity of metabolite molecules is critical for the understanding of the complex biological systems to which they belong. Since metabolite identities and their concentrations are often directly linked to the phenotype, such information can be used to map biochemical pathways and understand their role in health and disease. A very large number of metabolites, however, are still unknown; i.e., their spectroscopic signatures do not match those in existing databases, suggesting that unknown molecule identification is both imperative and challenging. Although metabolites are structurally highly diverse, the majority shares a rather limited number of structural motifs, which are defined by sets of 1H and 13C chemical shifts of the same spin system. This allows one to characterize unknown metabolites by a divide-and-conquer strategy that identifies their structural motifs first. Here, we present the structural motif-based approach "SUMMIT Motif" for the de novo identification of unknown molecular structures in complex mixtures, without the need for extensive purification, using NMR in tandem with two newly curated NMR molecular structural motif metabolomics databases (MSMMDBs). For the identification of structural motif(s), first, the 1H and 13C chemical shifts of all the individual spin systems are extracted from 2D and 3D NMR spectra of the complex mixture. Next, the molecular structural motifs are identified by querying these chemical shifts against the new MSMMDBs. One database, COLMAR MSMMDB, was derived from experimental NMR chemical shifts of known metabolites taken from the COLMAR metabolomics database, while the other MSMMDB, pNMR MSMMDB, is based on predicted chemical shifts of metabolites of several existing large metabolomics databases. For molecules consisting of multiple spin systems, spin systems are connected via long-range scalar J-couplings. When this motif-based identification method was applied to the hydrophilic extract of mouse bile fluid, two unknown metabolites could be successfully identified. This approach is both accurate and efficient for the identification of unknown metabolites and hence enables the discovery of new biochemical processes and potential biomarkers.
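
The database query step described in the abstract can be illustrated with a minimal sketch. It is not the SUMMIT Motif implementation: the `Motif` class, the `query_motifs` function, the tolerance values, and the toy database entries (`motif_A`, `motif_B`, with made-up chemical shifts) are all hypothetical and chosen only for illustration. The sketch assumes a spin system is a list of (1H, 13C) chemical shift pairs in ppm, and that a motif matches when it has the same number of peaks and every query peak can be assigned to a distinct motif peak within the tolerances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Motif:
    """A hypothetical database entry: a named set of (1H ppm, 13C ppm) peaks."""
    name: str
    peaks: tuple


def peak_matches(query, ref, tol_h, tol_c):
    """True if a query peak lies within both tolerances of a reference peak."""
    return abs(query[0] - ref[0]) <= tol_h and abs(query[1] - ref[1]) <= tol_c


def assign(query, refs, tol_h, tol_c):
    """Backtracking search for a one-to-one assignment of query peaks to refs."""
    if not query:
        return True
    first, rest = query[0], query[1:]
    for i, ref in enumerate(refs):
        if peak_matches(first, ref, tol_h, tol_c):
            # Remove the used reference peak so it cannot be assigned twice.
            if assign(rest, refs[:i] + refs[i + 1:], tol_h, tol_c):
                return True
    return False


def query_motifs(spin_system, database, tol_h=0.04, tol_c=0.4):
    """Return names of motifs whose peaks all match the query spin system.

    The default tolerances (ppm) are assumptions, not values from the paper.
    """
    hits = []
    for motif in database:
        if len(motif.peaks) != len(spin_system):
            continue
        if assign(list(spin_system), tuple(motif.peaks), tol_h, tol_c):
            hits.append(motif.name)
    return hits


# Toy database with invented shifts; real entries would come from an MSMMDB.
DB = [
    Motif("motif_A", ((3.20, 55.0), (1.50, 30.0), (0.90, 14.0))),
    Motif("motif_B", ((3.25, 70.0), (1.50, 30.0))),
]

# A query spin system, as it might be extracted from 2D and 3D NMR spectra.
QUERY = [(0.91, 14.2), (3.19, 54.8), (1.52, 30.1)]

if __name__ == "__main__":
    print(query_motifs(QUERY, DB))
```

Requiring a one-to-one assignment, rather than checking each peak independently, keeps two query peaks from both being explained by the same reference peak. A real query would likely also rank hits by a score such as the shift deviation, which this sketch omits.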

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bile / metabolism*
  • Biomarkers / analysis
  • Biomarkers / metabolism*
  • Complex Mixtures / analysis
  • Complex Mixtures / metabolism*
  • Databases, Factual
  • Escherichia coli / metabolism*
  • Magnetic Resonance Spectroscopy / methods*
  • Metabolome*
  • Mice

Substances

  • Biomarkers
  • Complex Mixtures