Review
Chem Res Toxicol. 2016 Dec 19;29(12):1956-1975.
doi: 10.1021/acs.chemrestox.6b00179. Epub 2016 Oct 12.

Computational Metabolomics: A Framework for the Million Metabolome

Free PMC article
Karan Uppal et al. Chem Res Toxicol.

Abstract

"Sola dosis facit venenum." These words of Paracelsus, "the dose makes the poison", can lead to a cavalier attitude concerning potential toxicities of the vast array of low abundance environmental chemicals to which humans are exposed. Exposome research teaches that 80-85% of human disease is linked to environmental exposures. The human exposome is estimated to include >400,000 environmental chemicals, most of which are uncharacterized with regard to human health. In fact, mass spectrometry measures >200,000 m/z features (ions) in microliter volumes derived from human samples; most are unidentified. This crystallizes a grand challenge for chemical research in toxicology: to develop reliable and affordable analytical methods to understand health impacts of the extensive human chemical experience. To this end, there appears to be no choice but to abandon the limitations of measuring one chemical at a time. The present review looks at progress in computational metabolomics to provide probability-based annotation linking ions to known chemicals and serve as a foundation for unambiguous designation of unidentified ions for toxicologic study. We review methods to characterize ions in terms of accurate mass m/z, chromatographic retention time, correlation of adduct, isotopic and fragment forms, association with metabolic pathways and measurement of collision-induced dissociation products, collision cross section, and chirality. Such information can support a largely unambiguous system for documenting unidentified ions in environmental surveillance and human biomonitoring. Assembly of this data would provide a resource to characterize and understand health risks of the array of low-abundance chemicals to which humans are exposed.

Figures

Figure 1
Increase in metabolomics publications over the last 15 years. Searches of PubMed for “metabolomics” or “metabonomics” with mass spectrometry (MS) or nuclear magnetic resonance (NMR) spectroscopy showed that the pioneering applications of chemometrics to NMR analysis of biological samples resulted in a rapid increase in MS-based studies.
Figure 2
Gap between analytical need and current capabilities for metabolomics analysis of human samples. The human metabolome is estimated to contain 1–3 million chemicals. Most targeted liquid chromatography and gas chromatography based mass spectrometry methods detect 300–700 metabolites, underscoring the substantial need for improved methods to test for chemical exposures associated with human disease. Analytical coverage is improved by probability-based methods providing moderate to high confidence scores for annotations of more than 2000 metabolites. Advanced computational methods facilitate the detection of more than 35,000 ions, and feasibility studies show the detection of 250,000 to 800,000 ions is possible.
Figure 3
High-resolution metabolomics data processing. Similar data processing procedures are used for peak picking and alignment. In xMSanalyzer, which is illustrated here, step 1 involves noise removal, peak detection, integration, and alignment at multiple parameter settings. In step 2, feature and sample quality assessments are performed at each parameter combination. In step 3, an optimization procedure merges and evaluates results from the different parameter settings to improve data quality and detection coverage, because data extraction using only one setting could give suboptimal results. In step 4, the merged results are used for additional quality assessment and correction, such as evaluation of internal standards and reference metabolites, mass calibration, and batch-effect correction. Step 5 involves m/z-based annotation of features using HMDB, KEGG, T3DB, and LipidMaps.
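The m/z-based annotation step can be illustrated with a minimal Python sketch. It is not the xMSanalyzer implementation: the function name `annotate_mz`, the two-entry reference table, the short adduct list, and the 10 ppm tolerance are all assumptions chosen for illustration. The idea is simply to compare an observed m/z against the theoretical m/z of each reference compound under each adduct form and keep matches within a parts-per-million window.

```python
from typing import List, NamedTuple

# Mass shifts (Da) for two common positive-mode adducts.
ADDUCT_SHIFTS = {"[M+H]+": 1.007276, "[M+Na]+": 22.989218}

# Toy reference table of monoisotopic neutral masses (Da).
# Hypothetical stand-in for a database export such as HMDB or KEGG.
REFERENCE_MASSES = {"glucose": 180.06339, "ketamine": 237.09204}


class Annotation(NamedTuple):
    name: str
    adduct: str
    ppm_error: float


def annotate_mz(observed_mz: float, tolerance_ppm: float = 10.0) -> List[Annotation]:
    """Return candidate annotations whose theoretical m/z lies within tolerance."""
    hits = []
    for name, neutral_mass in REFERENCE_MASSES.items():
        for adduct, shift in ADDUCT_SHIFTS.items():
            theoretical_mz = neutral_mass + shift
            ppm_error = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
            if abs(ppm_error) <= tolerance_ppm:
                hits.append(Annotation(name, adduct, ppm_error))
    # Best (smallest absolute error) candidates first.
    return sorted(hits, key=lambda hit: abs(hit.ppm_error))


if __name__ == "__main__":
    for hit in annotate_mz(238.0993):
        print(hit.name, hit.adduct, round(hit.ppm_error, 2))
```

A mass match alone is only a candidate annotation; several compounds can share one m/z within tolerance, which is why the review combines it with retention time, correlation structure, and dissociation spectra.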
Figure 4
Correlation-based network analysis to identify related ions and metabolites. Data-driven network analysis can be used to identify modules/clusters of strongly associated ions. Some of these associations are a consequence of analytical correlations, such as multiple adducts formed from a single chemical, while other associations are a consequence of biological relationships. In the example shown here for the anesthetic ketamine, each subcluster shows strong associations between the primary form, adducts, isotopes, and ionization fragments derived from the same metabolite. Secondary correlations exist between biologically related metabolites: ketamine and its metabolites norketamine and hydroxyketamine. Data are from the studies of Jones et al. and Uppal et al.
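As a rough illustration of how correlated ions can be grouped into modules, the sketch below computes pairwise Pearson correlations of ion intensities across samples and reports the connected components of the graph whose edges exceed a threshold. This is a simplified stand-in, not the method used in the cited studies; the function names, the 0.9 threshold, the restriction to positive correlations, and the toy intensity values are all assumptions.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Set


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def correlation_modules(
    intensities: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[Set[str]]:
    """Group ions into connected components of the graph with r >= threshold."""
    neighbors: Dict[str, Set[str]] = {name: set() for name in intensities}
    for a, b in combinations(intensities, 2):
        if pearson(intensities[a], intensities[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    modules: List[Set[str]] = []
    seen: Set[str] = set()
    for start in intensities:
        if start in seen:
            continue
        # Depth-first traversal collects every ion reachable from `start`.
        module: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbors[node] - module)
        seen |= module
        modules.append(module)
    return modules


if __name__ == "__main__":
    # Hypothetical intensities for three ions measured in five samples.
    toy = {
        "ion_a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "ion_a_sodium_adduct": [2.0, 4.0, 6.0, 8.0, 10.5],
        "unrelated_ion": [5.0, 1.0, 4.0, 2.0, 3.0],
    }
    print(correlation_modules(toy))
```

Connected components are the simplest possible clustering. A single threshold cannot by itself separate the tight analytical subclusters (adducts, isotopes, fragments) from the looser biological correlations described in the caption; that would need a second, lower threshold or a hierarchical method.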
Figure 5
Metabolome-wide association study for metabolite identification. Choline correlation in different species illustrates preservation of metabolic association structures, supporting metabolite annotation. The network structures for humans and the common marmoset contain metabolites exhibiting similarly significant correlations with choline. Like correlations of adducts formed from a chemical during ionization, the existence of network correlations of metabolites in biological systems provides a parameter for establishing confidence in identification, even for low abundance metabolites without quality MS/MS spectra. The figure was reproduced with permission from Uppal K., Soltow Q. A., Promislow D. E. L., Wachtman L. M., Quyyumi A. A. and Jones D. P. (2015) MetabNet: an R package for metabolic association analysis of high-resolution metabolomics data. Front. Bioeng. Biotechnol. 3:87. DOI: 10.3389/fbioe.2015.00087.
Figure 6
Computational identity prediction. (A) Distribution of metabolic features in a human data set with or without database matches in HMDB using common adduct forms showed that more than half of the ions reproducibly detected in human plasma did not have matches to known metabolites in HMDB. (B) Evaluation by MS/MS of medium-to-high confidence matches from a healthy human data set, obtained with a clustering approach based on correlation between ions across all samples, retention time, mass defect, and adduct and isotope patterns, showed that 80% of matches are correct. Thus, methods are improving for the identification of high abundance metabolites, with moderate to high confidence annotation for over 2000 chemical species. Despite the ability to characterize such a large number of metabolites, a much larger number of ions are without matches in databases, creating a major challenge for biological interpretation. Methods are needed to provide unambiguous designation of these ions to facilitate identification, especially for unidentified ions linked to human disease (e.g., see Table 1).
Figure 7
Ion definition in multivector space. Assembling the million metabolome will require an unambiguous system for defining detected, unidentified ions. In this framework, experimental measures including high-accuracy mass-to-charge ratio (m/z), retention time relative to landmark chemicals, correlation structure, ion dissociation spectra, collision cross-section (CCS) from ion mobility spectrometry, and enantiomer selective detection are combined to uniquely position an ion in chemical space. This is arbitrarily visualized here in terms of three-dimensional plots; expression could be made in terms of six or more one-dimensional vectors from a common origin. In this figure, three dimensions are designated in a way that leverages the capabilities of currently available analytical and computational approaches while enabling incorporation of future advances. The dimensions of Plot 1 on the left include untargeted profiling on high-resolution, accurate mass (HRAM) mass spectrometers coupled with chromatographic separation prior to detection. The use of landmark chemicals provides retention time indices for relative elution and metabolic correlation structure, which is anchored against the accurate m/z. Plot 2 in the middle is largely defined by structural characteristics of the molecule, which are designated by ion dissociation of precursor m/z from Plot 1 and CCS. Plot 3 on the right is defined by relative quantification of enantiomers. Several chiral methods are available but will require development for automated use in ion characterization.
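One way to picture a retention time index relative to landmark chemicals is linear interpolation between the two landmarks that bracket an ion's retention time. The caption does not say how the index is computed, so the interpolation scheme, the `IonDescriptor` record, its field names, and the landmark values below are assumptions made purely for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IonDescriptor:
    """Hypothetical record combining some of the measures named in the caption."""
    mz: float                          # accurate mass-to-charge ratio
    retention_index: float             # elution relative to landmark chemicals
    ccs: Optional[float] = None        # collision cross-section, if measured
    enantiomer: Optional[str] = None   # e.g. "R" or "S", if resolved


def retention_index(rt: float, landmarks: List[Tuple[float, float]]) -> float:
    """Interpolate an index from (retention_time, index) landmarks sorted by time."""
    times = [time for time, _ in landmarks]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the landmark range")
    pos = bisect_right(times, rt)
    if pos == len(times):
        # rt coincides with the last landmark.
        return landmarks[-1][1]
    (t0, i0), (t1, i1) = landmarks[pos - 1], landmarks[pos]
    return i0 + (i1 - i0) * (rt - t0) / (t1 - t0)


if __name__ == "__main__":
    # Hypothetical landmarks: retention times in seconds mapped to index values.
    landmarks = [(60.0, 100.0), (120.0, 200.0), (300.0, 300.0)]
    ion = IonDescriptor(mz=238.0993, retention_index=retention_index(90.0, landmarks))
    print(ion)
```

Expressing elution as an index rather than a raw time is what would let descriptors be compared across runs and laboratories, provided the same landmark chemicals are used.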
Figure 8
Separation of isobaric environmental chemicals by ion mobility spectrometry (IMS)-mass spectrometry. Organic aerosol species constitute a major fraction of airborne particles contributing to air pollution and adversely impacting health of humans and other species. The complex mixtures of organic aerosol species are difficult to resolve by conventional analytical methods, and little information is available concerning the levels or distribution of these chemicals in humans and other mammalian species. This figure from a recent application of IMS-MS to samples from the Southern Oxidant and Aerosol Study (SOAS) shows the utility of IMS-MS for this challenging environmental issue. IMS-MS was performed for hydroxysulfate esters (HSE; C5H11O7S) of isoprene epoxydiols (IEPOX) in four different aerosol filter samples. Dashed vertical lines designate signals for three different IMS peaks of IEPOX after conversion to the respective hydroxysulfate esters. Different stereoisomers of IEPOX are formed by radical reactions from isoprene hydroxyhydroperoxide intermediates. The stereoisomers are sufficiently resolved to allow discrimination of the different species. The bars on the top denote the uncertainty in the drift time dimension for each peak and were determined from the standard error of the mean of a mobility calibration compound from its average drift time. Additional details are provided in the original publication (Figure 4) by Krechmer et al. This figure was reproduced with permission granted by the original authors and Creative Commons Attribution 3.0 License.
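The drift-time uncertainty described in the caption is a standard error of the mean, which can be computed as the sample standard deviation divided by the square root of the number of measurements. The sketch below shows that calculation only; the function name and the replicate drift times are hypothetical and are not taken from Krechmer et al.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def drift_time_uncertainty(drift_times: Sequence[float]) -> Tuple[float, float]:
    """Return (average drift time, standard error of the mean)."""
    if len(drift_times) < 2:
        raise ValueError("at least two replicate measurements are required")
    average = mean(drift_times)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    standard_error = stdev(drift_times) / sqrt(len(drift_times))
    return average, standard_error


if __name__ == "__main__":
    # Hypothetical replicate drift times (ms) for a mobility calibration compound.
    replicates = [10.0, 10.2, 9.8, 10.0]
    average, standard_error = drift_time_uncertainty(replicates)
    print(f"{average:.3f} ms +/- {standard_error:.3f} ms")
```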
Figure 9
Developmental need exists for enantiomer-selective designation. Many environmental chemicals exist as stereoisomers, and this presents a challenge for chromatography and detection methods which do not resolve stereoisomers. Analytical data for S- and R-enantiomers of L-methionine sulfoxide illustrate the need for enantiomer-selective designation of ions. (A) Anion exchange (AE) chromatography was unable to separate enantiomers of L-methionine sulfoxide prior to detection, resulting in one peak representing the sum of the two enantiomers. Use of a chiral column with specific R- and S-interactions with the two enantiomers separated L-methionine(S)sulfoxide from L-methionine(R)sulfoxide, enabling quantification of each. (B,C) Ion dissociation (MS/MS) of the two enantiomers showed identical fragmentation patterns; the enantiomers are indistinguishable when defined by accurate mass, retention time, and MS/MS spectra. Thus, there is a need to develop methods to enable enantiomer-specific designation for ions in the million metabolome. Available analytical methods include chiral selectors, ion mobility with chiral gases, and chromatographic separation using enantiomer-specific retention mechanisms.
