BMC Bioinformatics. 2008 Nov 28;9:507.
doi: 10.1186/1471-2105-9-507.

MetaboMiner--semi-automated Identification of Metabolites From 2D NMR Spectra of Complex Biofluids

Free PMC article

Jianguo Xia et al. BMC Bioinformatics. 2008.

Abstract

Background: One-dimensional (1D) 1H nuclear magnetic resonance (NMR) spectroscopy is widely used in metabolomic studies involving biofluids and tissue extracts. There are several software packages that support compound identification and quantification via 1D 1H NMR by spectral fitting techniques. Because 1D 1H NMR spectra are characterized by extensive peak overlap or spectral congestion, two-dimensional (2D) NMR, with its increased spectral resolution, could potentially improve and even automate compound identification or quantification. However, the lack of dedicated software for this purpose significantly restricts the application of 2D NMR methods to most metabolomic studies.

Results: We describe a standalone graphics software tool, called MetaboMiner, which can be used to automatically or semi-automatically identify metabolites in complex biofluids from 2D NMR spectra. MetaboMiner is able to handle both 1H-1H total correlation spectroscopy (TOCSY) and 1H-13C heteronuclear single quantum correlation (HSQC) data. It identifies compounds by comparing 2D spectral patterns in the NMR spectrum of the biofluid mixture with specially constructed libraries containing reference spectra of approximately 500 pure compounds. Tests using a variety of synthetic and real spectra of compound mixtures showed that MetaboMiner is able to identify >80% of detectable metabolites from good quality NMR spectra.

Conclusion: MetaboMiner is a freely available, easy-to-use, NMR-based metabolomics tool that facilitates automatic peak processing, rapid compound identification, and facile spectrum annotation from either 2D TOCSY or HSQC spectra. Using comprehensive reference libraries coupled with robust algorithms for peak matching and compound identification, the program greatly simplifies the process of metabolite identification in complex 2D NMR spectra.

Figures

Figure 1
An illustration of the calculation of uniqueness values. The red peak is the peak of interest, and three other peaks (A, B and C) lie in its immediate vicinity. The calculation is performed at only five chemical-shift distance levels: 0.01, 0.02, 0.03, 0.04 and 0.05 ppm along the 1H dimension, and 0.05, 0.10, 0.15, 0.20 and 0.25 ppm along the 13C dimension. No peak is observed within the first three distance levels, so the maximum unique scope for this peak is (0.03, 0.15) ppm. Peak A lies within 0.03 to 0.04 ppm (1H dimension) and 0.15 to 0.20 ppm (13C dimension) of the red peak; Peak B lies within 0.04 to 0.05 ppm (1H dimension) and 0.20 to 0.25 ppm (13C dimension) of the red peak; Peak C is not considered because its chemical shift distance exceeds 0.05 ppm along the 1H dimension. The assigned uniqueness values are therefore 0-0-0-1-2. Note that distances are not drawn to scale.
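The counting described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MetaboMiner's actual code: the function names are hypothetical, a neighbour is assumed to count at a level only when it falls inside both the 1H and the 13C tolerance of that level, and the peak coordinates in the usage note are invented to mimic the layout of peaks A, B and C.

```python
from typing import List, Optional, Sequence, Tuple

Peak = Tuple[float, float]  # (1H shift in ppm, 13C shift in ppm)

# The five distance levels quoted in the caption.
H_LEVELS = (0.01, 0.02, 0.03, 0.04, 0.05)
C_LEVELS = (0.05, 0.10, 0.15, 0.20, 0.25)


def uniqueness_values(peak: Peak, neighbours: Sequence[Peak]) -> List[int]:
    """Count, for each level, the neighbours inside the (1H, 13C) box.

    Assumption: a neighbour counts at a level only if it is within the
    tolerance in BOTH dimensions, and counts are cumulative across levels.
    """
    h0, c0 = peak
    counts = []
    for h_tol, c_tol in zip(H_LEVELS, C_LEVELS):
        inside = sum(
            1
            for h, c in neighbours
            if abs(h - h0) <= h_tol and abs(c - c0) <= c_tol
        )
        counts.append(inside)
    return counts


def max_unique_scope(values: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return the largest level at which no neighbour was found, if any."""
    scope = None
    for count, h_tol, c_tol in zip(values, H_LEVELS, C_LEVELS):
        if count > 0:
            break
        scope = (h_tol, c_tol)
    return scope
```

With a red peak at (1.000, 20.00) ppm and invented neighbours A = (1.035, 20.18), B = (0.955, 19.78) and C = (1.080, 20.10), `uniqueness_values` gives the 0-0-0-1-2 pattern from the caption and `max_unique_scope` gives (0.03, 0.15).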
Figure 2
MetaboMiner flowchart. The query peaks obtained from an automatic peak-picking program are first processed to remove streaks and other artefacts. The cleaned peak list is then scanned for the presence of peak patterns of compounds in a spectral reference library corresponding to the biofluid that has been identified by the user. Spectral images can be used to further refine the search result.
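The library-scanning step of this flowchart can be pictured with a small Python sketch. It is a simplified stand-in rather than the program's algorithm: it scores each library compound by the fraction of its reference peaks that have a query peak nearby (in the spirit of a percentage match). The tolerances, the 0.6 cut-off, the function names, and the compound names and shifts in the usage note are all assumptions made for illustration. Streak removal and image-based refinement are not modelled.

```python
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (1H shift in ppm, 13C shift in ppm)


def has_match(ref: Peak, query: Sequence[Peak],
              h_tol: float = 0.03, c_tol: float = 0.15) -> bool:
    """True if any query peak lies within both tolerances of the reference peak."""
    return any(
        abs(ref[0] - q[0]) <= h_tol and abs(ref[1] - q[1]) <= c_tol
        for q in query
    )


def percentage_match(ref_peaks: Sequence[Peak], query: Sequence[Peak]) -> float:
    """Fraction of a compound's reference peaks that are found in the query."""
    if not ref_peaks:
        return 0.0
    matched = sum(1 for p in ref_peaks if has_match(p, query))
    return matched / len(ref_peaks)


def scan_library(library: Dict[str, List[Peak]], query: Sequence[Peak],
                 min_fraction: float = 0.6) -> Dict[str, float]:
    """Return compounds whose match fraction reaches the (assumed) cut-off."""
    hits = {}
    for name, ref_peaks in library.items():
        score = percentage_match(ref_peaks, query)
        if score >= min_fraction:
            hits[name] = score
    return hits
```

For example, with a made-up library `{"compound_X": [(1.0, 20.0), (2.0, 30.0)], "compound_Y": [(5.0, 100.0), (6.0, 110.0)]}` and query peaks `[(1.01, 20.05), (2.02, 29.9), (5.0, 100.0)]`, only `compound_X` is reported, because only half of the `compound_Y` peaks are matched.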
Figure 3
Screenshot of MetaboMiner's "Search View". The left panel shows the library compounds that have matches in the query peaks. A selected checkbox indicates that MetaboMiner considers the corresponding compound to be present. 'R' or 'F' indicates whether the compound was identified during the reverse search or the forward search, respectively. On the right panel, the reference peaks (in red) of the currently selected compound are displayed with the query peaks as background. The color variations represent peak intensities, with dark green corresponding to the strongest intensities. When the mouse is placed over any synthetic peak, all its information (name, position, uniqueness values, etc.) is displayed on the view panel. Right-clicking on any peak allows users to search the spectral library for that particular peak.
Figure 4
Screenshot of MetaboMiner's "Annotation View". The contents of the reference spectral library and the automatically identified compound list are shown on the left panel. The spectral image is displayed on the right panel. The red peaks correspond to the reference spectra of the current compound being annotated (Valine). Peak searching is carried out by right clicking on a corresponding Valine peak. The user can also directly edit the current compound by inserting, removing, or dragging its peaks to match the exact pattern of the reference spectrum.
Figure 5
Comparative performance of different search strategies. Synthetic mixture query spectra were generated by pooling the peaks of 50 randomly selected compounds from MetaboMiner's reference spectral library. Different levels of spectral noise were added to these peaks, and compounds were then identified with (*) and without the adaptive threshold method. In Figure 5A, the query peaks were deleted at random with 0%, 10%, 20%, 30%, 40% and 50% probabilities; in Figure 5B, the query peaks were subjected to five levels of random chemical shift variation (± 0.01, ± 0.02, ± 0.03, ± 0.04, ± 0.05 ppm for each 1H chemical shift, and ± 0.05, ± 0.10, ± 0.15, ± 0.20, ± 0.25 ppm for each 13C chemical shift). The F scores were averaged over 50 iterations. (Abbreviations: PM, percentage match method; MS, minimal signature method).
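For readers unfamiliar with the metric, the following Python sketch shows how an F score could be computed for one simulated run. It assumes the conventional balanced F-measure (the harmonic mean of precision and recall); the caption does not state which variant was used, and the function name is hypothetical.

```python
from typing import Iterable


def f_score(true_compounds: Iterable[str], predicted: Iterable[str]) -> float:
    """Balanced F-measure between the compounds truly present and those reported.

    Assumption: F = 2 * precision * recall / (precision + recall).
    """
    true_set, pred_set = set(true_compounds), set(predicted)
    true_positives = len(true_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)  # correct among reported
    recall = true_positives / len(true_set)     # correct among truly present
    return 2 * precision * recall / (precision + recall)
```

If four compounds are present and three are reported, two of them correctly, precision is 2/3 and recall is 1/2, so the F score is 4/7, or about 0.57.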
Figure 6
Evaluation of MetaboMiner using simulated datasets. Synthetic mixture query spectra were generated by pooling peaks from 20, 30, 40, 50, 60, 70, and 80 compounds randomly selected from MetaboMiner's spectral library. Spectral noise was introduced via random (10%) peak deletion and random chemical shift changes within ± 0.01 ppm for each 1H chemical shift, and within ± 0.05 ppm for each 13C chemical shift. Compound identification was based on minimal signatures using the adaptive threshold method. The F-measures were averaged over 50 iterations.
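The simulation procedure in this caption might be reproduced along the following lines. This is a sketch under assumptions, not the authors' script: shift perturbations are drawn uniformly within the stated bounds (the caption says only "random"), each peak is deleted independently with the given probability, and the function name, parameter names and `seed` argument are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

Peak = Tuple[float, float]  # (1H shift in ppm, 13C shift in ppm)


def make_synthetic_query(
    library: Dict[str, List[Peak]],
    n_compounds: int,
    delete_prob: float = 0.10,  # random peak deletion rate from the caption
    h_jitter: float = 0.01,     # +/- bound on 1H shift changes (ppm)
    c_jitter: float = 0.05,     # +/- bound on 13C shift changes (ppm)
    seed: Optional[int] = None,
) -> Tuple[List[str], List[Peak]]:
    """Pool the peaks of randomly chosen compounds, then add noise."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(library), n_compounds)
    query = []
    for name in chosen:
        for h, c in library[name]:
            if rng.random() < delete_prob:
                continue  # this peak is dropped from the query
            # Assumed uniform perturbation within the stated bounds.
            query.append((h + rng.uniform(-h_jitter, h_jitter),
                          c + rng.uniform(-c_jitter, c_jitter)))
    return chosen, query
```

Repeating this for each mixture size (20 to 80 compounds), identifying compounds in the resulting query, and averaging the F-measure over 50 seeds would mirror the evaluation loop described above.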
