Data-driven strategy for the discovery of potential urinary biomarkers of habitual dietary exposure

Am J Clin Nutr. 2013 Feb;97(2):377-89. doi: 10.3945/ajcn.112.048033. Epub 2012 Dec 26.


Background: An understanding of causal relations between diet and health is hindered by the lack of robust biological markers of food exposure.

Objective: We aimed to develop a data-driven procedure to discover urine biomarkers indicative of habitual exposure to different foods.

Design: The habitual diet of 68 participants was assessed by using 4 food-frequency questionnaires over 3 mo, and participants were assigned to different consumption-frequency classes for 58 dietary components. Flow infusion electrospray-ionization mass spectrometry followed by supervised multivariate data analysis was used to determine whether the chemical composition of urine was related to specific differences in the consumption levels of each food.
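
As a rough illustration of the class-assignment step described in the Design, the sketch below averages repeated food-frequency questionnaire reports for one food, maps each participant to a consumption-frequency class, and then pairs the lowest class with each higher class for two-group comparison. The class cutoffs, participant IDs, intake values, and function names are all hypothetical; the abstract does not state the study's actual class boundaries or modeling code, and the supervised multivariate modeling itself is not shown.

```python
from statistics import mean

# Hypothetical upper bounds (portions per week) for each frequency class.
# The study's real class boundaries are not given in the abstract.
CLASS_UPPER_BOUNDS = [1.0, 3.0, 7.0]  # class 0: <=1, 1: <=3, 2: <=7, 3: >7


def frequency_class(ffq_reports):
    """Average repeated FFQ reports and map the mean to a class index."""
    avg = mean(ffq_reports)
    for idx, bound in enumerate(CLASS_UPPER_BOUNDS):
        if avg <= bound:
            return idx
    return len(CLASS_UPPER_BOUNDS)


def low_vs_higher_comparisons(classes):
    """Pair the lowest occupied class with each higher class."""
    groups = {}
    for participant, cls in classes.items():
        groups.setdefault(cls, []).append(participant)
    lowest = min(groups)
    return [
        (sorted(groups[lowest]), sorted(groups[cls]))
        for cls in sorted(groups)
        if cls != lowest
    ]


# Invented example data: four FFQ reports of citrus portions per week.
citrus_ffq = {
    "P01": [0, 0, 1, 0],
    "P02": [2, 3, 2, 2],
    "P03": [8, 9, 7, 10],
    "P04": [1, 0, 0, 1],
}

citrus_classes = {p: frequency_class(r) for p, r in citrus_ffq.items()}
comparisons = low_vs_higher_comparisons(citrus_classes)
```

Each pair in `comparisons` would then feed a separate two-class model of the urine fingerprints, so that discrimination can be compared as the higher-intake group moves further from the low-frequency consumers.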

Results: Foods were eaten habitually in 1 of 5 basic patterns differing in range and distribution of consumption frequency. Overnight, 24-h, and fasting urine samples proved useful for biomarker lead discovery with habitual citrus exposure used as a paradigm. Exposure level discrimination robustness improved linearly as urine samples from low-frequency citrus consumers were compared with urine samples from participants reporting increasingly higher intakes. For all foods, distinctiveness and consumption-frequency range influenced the likelihood that differential dietary exposure could be detected. Model output statistics indicated foods for which biomarker lead discovery was feasible. Metabolites proposed previously as acute intake biomarkers of citrus (proline betaine), oily fish (methylhistidine), coffee (dihydrocaffeic acid derivatives), and tomato (phenolic metabolites) were also biomarkers of habitual exposure. A significance threshold in modeling output statistics was determined to guide the discovery of potential biomarkers for other foods.

Conclusion: This data-driven strategy can identify urinary metabolites associated with habitual exposure to specific foods. This trial has the UK registration number 4349 and was registered as CCT-NAPN-A13175.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers / urine
  • Citrus / chemistry*
  • Diet*
  • Feasibility Studies
  • Feeding Behavior*
  • Female
  • Fruit / chemistry*
  • Humans
  • Male
  • Middle Aged
  • Models, Biological*
  • Multivariate Analysis
  • Proline / analogs & derivatives
  • Proline / urine
  • Spectrometry, Mass, Electrospray Ionization
  • Surveys and Questionnaires

Substances

  • Biomarkers
  • Proline
  • stachydrine