Enhanced metabolite annotation via dynamic retention time prediction: Steroidogenesis alterations as a case study

J Chromatogr B Analyt Technol Biomed Life Sci. 2017 Dec 15:1071:11-18. doi: 10.1016/j.jchromb.2017.04.032. Epub 2017 Apr 23.


The development of metabolomics based on ultra-high pressure liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-HRMS) now allows hundreds to thousands of metabolites to be simultaneously monitored in biological matrices. In that context, bioinformatics and multivariate data analysis (MVA) play a crucial role in the detection of relevant alteration patterns. However, sound biological interpretations must be supported by metabolite identifications that are definitive or at least carry a high degree of confidence. Each compound should be characterised by unique molecular properties. Among them, the exact mass and the chromatographic retention time are recognised as major and complementary criteria for compound identification. While the former is easily derived from the molecular structure, building generic and accurate open retention time databases remains a critical issue because of the vast diversity of instruments, stationary phases and operating conditions in UHPLC-HRMS. Because an exact mass and an isotopic pattern often yield several candidate hits matching the same molecular formula for each analyte, mass-based matching alone rarely provides a unique and unambiguous molecular identity. This work aims to provide a flexible solution to facilitate reliable compound annotation based on retention time in reversed-phase liquid chromatography (RPLC). It proposes an innovative approach based on the chromatographic linear solvent strength (LSS) theory, which allows retention times to be predicted under any gradient conditions for a fixed temperature, stationary phase and mobile phase type. Starting from a subset of the Human Metabolome Database (HMDB), a new dynamic database involving LSS parameters was developed. A real case study involving steroidogenesis alterations due to forskolin exposure was conducted using the adrenal H295R OECD reference cell model for endocrine disruptor screening. The prediction of retention times was successfully achieved, facilitating steroid identification. An automated procedure that implements the compound annotation levels encouraged by the Metabolomics Standards Initiative (MSI) and the Coordination of Standards in Metabolomics (COSMOS) was also developed to speed up the process and enhance data reusability.
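
As an illustration of how LSS parameters can yield a retention time for an arbitrary linear gradient, the sketch below applies the textbook LSS gradient equation. It is not the authors' implementation: the function name, the parameter names and all numerical values are hypothetical, and it assumes the usual LSS relation log10 k = log10 k_w - S·φ, a single linear gradient, and a compound that elutes during the gradient ramp.

```python
import math


def lss_gradient_retention_time(log_kw, s, phi_start, phi_end,
                                gradient_time, dead_time, dwell_time=0.0):
    """Predict a gradient retention time from LSS parameters.

    Assumes log10(k) = log_kw - s * phi and a single linear gradient from
    phi_start to phi_end (organic modifier volume fractions). Times share
    one unit (e.g. minutes). Migration during the dwell time is neglected,
    and the compound is assumed to elute before the gradient ends.
    """
    # Retention factor at the initial mobile phase composition.
    k_start = 10 ** (log_kw - s * phi_start)
    # Gradient steepness parameter b of the LSS model.
    b = s * (phi_end - phi_start) * dead_time / gradient_time
    # Classical LSS expression for the gradient retention time.
    return (dead_time + dwell_time
            + (dead_time / b) * math.log10(math.log(10) * k_start * b + 1))


# Hypothetical LSS parameters for one compound (illustrative values only).
params = dict(log_kw=2.0, s=4.0, phi_start=0.05, phi_end=0.95, dead_time=0.5)

rt_fast = lss_gradient_retention_time(gradient_time=5.0, **params)
rt_slow = lss_gradient_retention_time(gradient_time=20.0, **params)
print(f"5 min gradient:  {rt_fast:.2f} min")
print(f"20 min gradient: {rt_slow:.2f} min")
```

Because only the compound-specific pair (log k_w, S) is stored, the same entry can be re-evaluated for a different gradient time or composition range, which is what makes such a database "dynamic" in the sense used above. Temperature, stationary phase and mobile phase type must stay fixed for the stored parameters to remain valid.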

Keywords: Dynamic retention time prediction; H295R cell line; Linear solvent strength theory; Metabolite annotation; Metabolomics; Steroidogenesis.

MeSH terms

  • Cell Line
  • Chromatography, High Pressure Liquid
  • Chromatography, Reverse-Phase
  • Computational Biology / methods*
  • Data Curation / methods*
  • Databases, Factual
  • Humans
  • Mass Spectrometry
  • Metabolomics / methods*
  • Models, Theoretical
  • Steroids / metabolism*


Substances

  • Steroids