An efficient spectra processing method for metabolite identification from 1H-NMR metabolomics data

Anal Bioanal Chem. 2013 Jun;405(15):5049-61. doi: 10.1007/s00216-013-6852-y. Epub 2013 Mar 23.


Spectra processing is a crucial step in metabolomics, especially for proton NMR metabolomic profiling. During this step, noise reduction, baseline correction, peak alignment and reduction of the 1D 1H-NMR spectral data are required so that subsequent statistical analyses can highlight the biological information. Above all, data reduction (binning or bucketing) strongly affects subsequent statistical analysis and potential biomarker discovery. Here, we propose an efficient spectra processing method that also supports compound identification through a new data reduction algorithm producing relevant variables, called buckets. These buckets result from the extraction of all relevant peaks contained in the complex mixture spectra, free of any non-significant signal. Taking advantage of the concentration variability of each compound across a series of samples, the method groups buckets linked by significant correlations into clusters and then proposes automatic metabolite assignment by matching these clusters against the spectra of reference compounds from the Human Metabolome Database or an in-house database. The method is applied to a set of simulated 1H-NMR spectra to determine the effect of some processing parameters and, as a proof of concept, to a tomato 1H-NMR dataset to test its ability to recover the composition of the fruit extracts. The implementation code for both the clustering and matching steps is available upon request from the corresponding author.
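
The clustering and matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is only available on request. It assumes a plain Pearson correlation threshold (`r_min`), connected-component grouping of buckets, and a simple fraction-of-peaks score with a chemical shift tolerance (`tol`). The function names, the ppm values and the `compound_A`/`compound_B` reference entries are all hypothetical and chosen for illustration only.

```python
import numpy as np


def cluster_buckets(X, r_min=0.9):
    """Group buckets (columns of X) whose intensities correlate across samples.

    Assumption: buckets are linked when their Pearson r is >= r_min, and
    clusters are the connected components of that link graph.
    """
    corr = np.corrcoef(X, rowvar=False)  # bucket-by-bucket correlation matrix
    n = corr.shape[0]
    parent = list(range(n))  # union-find forest over bucket indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] >= r_min:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    # Singleton buckets carry no correlation evidence, so they are dropped.
    return [members for members in clusters.values() if len(members) > 1]


def match_cluster(cluster_ppm, references, tol=0.02):
    """Score each reference by the fraction of its peaks found in the cluster."""
    scores = {}
    for name, ref_ppm in references.items():
        hits = sum(any(abs(p - q) <= tol for q in cluster_ppm) for p in ref_ppm)
        scores[name] = hits / len(ref_ppm)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical data: 8 samples, 5 buckets (two per compound plus one unrelated).
conc_a = np.linspace(1.0, 10.0, 8)
conc_b = np.array([5.0, 1.0, 9.0, 2.0, 8.0, 3.0, 7.0, 4.0])
other = np.array([0.3, 0.1, 0.4, 0.1, 0.5, 0.9, 0.2, 0.6])
X = np.column_stack([conc_a, 0.5 * conc_a, 2.0 * conc_b, 0.8 * conc_b, other])
bucket_ppm = np.array([1.20, 3.10, 2.50, 4.00, 6.80])

# Hypothetical reference peak lists (not taken from any real database).
references = {"compound_A": [1.21, 3.09], "compound_B": [2.50, 4.01]}

for members in cluster_buckets(X):
    best, scores = match_cluster(bucket_ppm[members], references)
    print(members, best, scores)
```

In this toy dataset the buckets of each compound vary together across samples, so they end up in the same cluster, and each cluster is then assigned to the reference whose peak positions it covers best. A real workflow would also need the noise reduction, baseline correction and alignment steps mentioned in the abstract, plus a significance test on the correlations rather than a fixed threshold.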

MeSH terms

  • Biomarkers / analysis
  • Biomarkers / metabolism
  • Cluster Analysis
  • Computer Simulation
  • Gene Expression Regulation, Plant / physiology
  • Humans
  • Magnetic Resonance Spectroscopy / methods*
  • Metabolomics / methods*
  • Solanum lycopersicum / chemistry
  • Solanum lycopersicum / metabolism*

