Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics

Anal Chem. 2017 Mar 21;89(6):3517-3523. doi: 10.1021/acs.analchem.6b04719. Epub 2017 Mar 7.


As mass spectrometry-based metabolomics becomes more widely used in biomedical research, it is important to revisit existing data analysis paradigms. Existing data preprocessing efforts have largely focused on methods that first extract features separately from each sample and then attempt to group those features across samples to facilitate comparisons. We show that this preprocessing approach leads to unnecessary variability in peak quantifications that adversely impacts downstream analysis. We present a new method, bakedpi, for the preprocessing of both centroid and profile mode metabolomics data that detects peaks using intensity-weighted bivariate kernel density estimation on the pooled data from all samples. The method reduces this quantification variability and increases power in downstream differential analysis.
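
The idea of detecting peaks on pooled data and then quantifying every sample within the same bounds can be illustrated with a small sketch. This is not the bakedpi implementation: it is a simplified, binned approximation to an intensity-weighted bivariate Gaussian kernel density estimate. The synthetic data, grid, bandwidths, 5% density threshold, and all function names (`weighted_density`, `bounding_boxes`, `quantify`, `make_sample`) are assumptions made for illustration only.

```python
import numpy as np

def weighted_density(mz, rt, intensity, mz_edges, rt_edges, bw_mz, bw_rt):
    """Binned approximation to an intensity-weighted bivariate Gaussian KDE."""
    # Each data point contributes its intensity (not a count of 1) to its bin.
    hist, _, _ = np.histogram2d(mz, rt, bins=[mz_edges, rt_edges], weights=intensity)

    def kernel(bw, step):
        half = int(np.ceil(3 * bw / step))  # truncate at about 3 bandwidths
        x = np.arange(-half, half + 1) * step
        k = np.exp(-0.5 * (x / bw) ** 2)
        return k / k.sum()

    k_mz = kernel(bw_mz, mz_edges[1] - mz_edges[0])
    k_rt = kernel(bw_rt, rt_edges[1] - rt_edges[0])
    # A product Gaussian kernel is separable, so smooth one axis at a time.
    # (Assumes the grid has more bins than the kernel has taps.)
    out = np.apply_along_axis(np.convolve, 0, hist, k_mz, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, k_rt, mode="same")

def bounding_boxes(density, threshold):
    """Bounding boxes (bin indices) of 4-connected regions above threshold."""
    mask = density > threshold
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for i, j in zip(*np.nonzero(mask)):
        if seen[i, j]:
            continue
        i0 = i1 = int(i)
        j0 = j1 = int(j)
        stack = [(int(i), int(j))]
        seen[i, j] = True
        while stack:  # iterative flood fill over one region
            a, b = stack.pop()
            i0, i1, j0, j1 = min(i0, a), max(i1, a), min(j0, b), max(j1, b)
            for c, d in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                if (0 <= c < mask.shape[0] and 0 <= d < mask.shape[1]
                        and mask[c, d] and not seen[c, d]):
                    seen[c, d] = True
                    stack.append((c, d))
        boxes.append((i0, i1, j0, j1))
    return boxes

def quantify(samples, boxes, mz_edges, rt_edges):
    """Sum each sample's intensity inside the SAME jointly detected bounds."""
    table = np.zeros((len(boxes), len(samples)))
    for s, (mz, rt, inten) in enumerate(samples):
        for p, (i0, i1, j0, j1) in enumerate(boxes):
            inside = ((mz >= mz_edges[i0]) & (mz < mz_edges[i1 + 1]) &
                      (rt >= rt_edges[j0]) & (rt < rt_edges[j1 + 1]))
            table[p, s] = inten[inside].sum()
    return table

def make_sample(scale, shift):
    """Synthetic sample (assumption): two Gaussian-shaped chromatographic peaks."""
    rt1 = np.arange(40, 61) + 0.5
    rt2 = np.arange(70, 91) + 0.5
    rt = np.concatenate([rt1, rt2])
    mz = np.concatenate([100.01 + 0.002 * np.sin(rt1), 100.51 + 0.002 * np.sin(rt2)])
    center = np.concatenate([np.full(rt1.size, 50.5), np.full(rt2.size, 80.5)]) + shift
    intensity = scale * 1000.0 * np.exp(-0.5 * ((rt - center) / 3.0) ** 2)
    return mz, rt, intensity

# Three samples with different abundances and small retention-time shifts.
samples = [make_sample(s, d) for s, d in [(1.0, -1.0), (2.0, 0.0), (3.0, 1.0)]]
pooled_mz, pooled_rt, pooled_int = (np.concatenate(c) for c in zip(*samples))

mz_edges = np.linspace(99.5, 101.0, 61)   # 60 m/z bins
rt_edges = np.linspace(30.0, 100.0, 71)   # 70 retention-time bins
density = weighted_density(pooled_mz, pooled_rt, pooled_int,
                           mz_edges, rt_edges, bw_mz=0.02, bw_rt=2.0)
boxes = bounding_boxes(density, 0.05 * density.max())
table = quantify(samples, boxes, mz_edges, rt_edges)  # rows: peaks, cols: samples
```

Because the bounds are derived once from the pooled density, every sample is integrated over an identical m/z and retention-time region, which is the property the abstract credits with reducing quantification variability. A per-sample approach would instead produce slightly different bounds for each sample before any grouping step.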

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Androgens / blood
  • Androgens / metabolism*
  • Animals
  • Arabidopsis / chemistry
  • Arabidopsis / metabolism
  • Cell Line
  • Female
  • Humans
  • Hyperinsulinism / blood
  • Hyperinsulinism / metabolism*
  • Infant
  • Liver / chemistry
  • Liver / metabolism
  • MCF-7 Cells
  • Mass Spectrometry
  • Metabolomics*
  • Mice
  • Plant Leaves / chemistry
  • Plant Leaves / metabolism
  • Resveratrol / analysis
  • Resveratrol / metabolism*

Substances

  • Androgens
  • Resveratrol