Large-scale human metabolomics studies: a strategy for data (pre-) processing and validation

Anal Chem. 2006 Jan 15;78(2):567-74. doi: 10.1021/ac051495j.


A large metabolomics study was performed on 600 plasma samples taken at four time points before and after a single intake of a high-fat test meal by obese and lean subjects. All samples were analyzed by a liquid chromatography-mass spectrometry (LC-MS) lipidomic method for metabolic profiling. A pragmatic approach combining several well-established statistical methods was developed for processing this large data set in order to detect small differences in metabolic profiles against a background of large biological variation. Such metabolomics studies require a careful analytical and statistical protocol. The strategy included data preprocessing, data analysis, and validation of statistical models. After several data preprocessing steps, partial least-squares discriminant analysis (PLS-DA) was used to find biomarkers. To assess the statistical validity of these biomarkers, the PLS-DA models were validated by means of a permutation test, biomarker models, and noninformative models. Univariate plots of potential biomarkers were used to gain insight into up- or downregulation. The proposed strategy proved applicable to large-scale human metabolomics studies.
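The permutation test mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not give the number of PLS components, the cross-validation scheme, or the test statistic. So the one-component model, the five interleaved folds, the misclassification count, the synthetic data, and every function name below (`fit_pls1`, `cv_errors`, `permutation_test`, `make_data`) are assumptions chosen for illustration only.

```python
import random


def fit_pls1(X, y):
    """Fit a one-component PLS model on labels coded +1 / -1 (assumed coding)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: covariance of each variable with the class label.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Sample scores, then regression of the label on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / (sum(v * v for v in t) or 1.0)
    return x_mean, y_mean, w, q


def predict(model, row):
    """Assign a class by thresholding the predicted label at zero."""
    x_mean, y_mean, w, q = model
    score = sum((row[j] - x_mean[j]) * w[j] for j in range(len(w)))
    return 1 if y_mean + q * score >= 0 else -1


def cv_errors(X, y, k=5):
    """Count misclassified samples over k interleaved cross-validation folds."""
    errors = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(model, X[i]) != y[i] for i in test)
    return errors


def permutation_test(X, y, n_perm=200, seed=0):
    """Compare the observed error count with counts from shuffled labels."""
    rng = random.Random(seed)
    observed = cv_errors(X, y)
    null = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any link between profiles and classes
        null.append(cv_errors(X, y_perm))
    # Share of permutations doing at least as well as the real labels.
    p_value = (1 + sum(e <= observed for e in null)) / (1 + n_perm)
    return observed, null, p_value


def make_data(n=40, p=20, shift=2.0, seed=1):
    """Synthetic stand-in data: only the first three variables differ by class."""
    rng = random.Random(seed)
    y = [1 if i % 2 == 0 else -1 for i in range(n)]
    X = [[rng.gauss(0, 1) + (shift * y[i] / 2 if j < 3 else 0.0)
          for j in range(p)] for i in range(n)]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    observed, null, p_value = permutation_test(X, y)
    print("observed CV errors:", observed)
    print("mean CV errors under permutation:", sum(null) / len(null))
    print("permutation p-value:", p_value)
```

The whole cross-validation is repeated for every permuted label vector, so the null distribution reflects the same modeling procedure as the real analysis. A model that separates the classes only by chance would be expected to perform no better on the real labels than on many of the shuffled ones.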

Publication types

  • Multicenter Study

MeSH terms

  • Chromatography, Liquid
  • Data Interpretation, Statistical*
  • Dietary Fats / administration & dosage*
  • Dietary Fats / blood
  • Europe
  • Humans
  • Least-Squares Analysis*
  • Lipids / blood*
  • Mass Spectrometry
  • Obesity / blood*
  • Postprandial Period / physiology


Substances

  • Dietary Fats
  • Lipids