Computational mass spectrometry for metabolomics: identification of metabolites and small molecules

Anal Bioanal Chem. 2010 Dec;398(7-8):2779-88. doi: 10.1007/s00216-010-4142-5. Epub 2010 Oct 9.


The identification of compounds from mass spectrometry (MS) data is still seen as a major bottleneck in the interpretation of such data. This is particularly true for small compounds such as metabolites, where little progress had been made until recently. Here we review the available approaches to the annotation and identification of chemical compounds based on electrospray ionization mass spectrometry (ESI-MS) data. The methods are not limited to metabolomics applications but apply to any small compound amenable to MS analysis. Starting with a definition of identification, we focus on the analysis of tandem mass (MS/MS) and multistage (MSⁿ) spectra, which can provide a wealth of structural information. Searching libraries of reference spectra is the most reliable route to identification, especially if the reference spectra were measured on comparable instruments. We review several choices of distance function for comparing a query spectrum with reference spectra. Identification without reference spectra is even more challenging, because it requires approaches that interpret tandem mass spectra with regard to the molecular structure. Both commercial and free tools are capable of mining general-purpose compound libraries and identifying candidate compounds. The holy grail of computational mass spectrometry is the de novo deduction of structure hypotheses for compounds, for which method development has only just begun. In a case study, we apply several of the available methods to three compounds (kaempferol, reserpine, and verapamil) and investigate whether this results in reliable identifications.
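
As an illustration of what a distance function for spectral library search can look like, the following minimal sketch computes a cosine (normalized dot product) similarity between two peak lists after binning by m/z. It is not taken from the review: the function names, the bin width of 1.0, and the peak values are all hypothetical, and tools described in the literature typically add intensity weighting and mass-tolerance matching on top of this basic idea.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(query, reference, bin_width=1.0):
    """Cosine similarity of two binned spectra: 0.0 (no shared bins) to 1.0 (same shape)."""
    a = bin_spectrum(query, bin_width)
    b = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty spectrum matches nothing.
    return dot / (norm_a * norm_b)


# Hypothetical peak lists (m/z, intensity); not measured data.
query = [(100.2, 50.0), (150.7, 100.0), (200.1, 25.0)]
reference = [(100.4, 40.0), (150.9, 100.0), (250.3, 10.0)]
unrelated = [(300.5, 80.0), (410.1, 20.0)]

if __name__ == "__main__":
    print(cosine_similarity(query, reference))
    print(cosine_similarity(query, unrelated))
```

In a library search, the query spectrum would be scored against every reference spectrum in this way and the candidates ranked by similarity. Fixed-width binning is the simplest choice here and is sensitive to peaks that fall near a bin boundary, which is one reason alternative distance functions are worth comparing.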

Publication types

  • Review

MeSH terms

  • Computational Biology / methods*
  • Humans
  • Metabolomics / methods*
  • Spectrometry, Mass, Electrospray Ionization / methods*
  • Tandem Mass Spectrometry / methods*