A benchmark study of data normalisation methods for PTR-TOF-MS exhaled breath metabolomics

J Breath Res. 2023 Nov 10;18(1). doi: 10.1088/1752-7163/ad08ce.

Abstract

Volatilomics is the branch of metabolomics dedicated to the analysis of volatile organic compounds in exhaled breath for medical diagnostic or therapeutic monitoring purposes. Real-time mass spectrometry (MS) technologies such as proton transfer reaction (PTR) MS are commonly used, and data normalisation is an important step to discard unwanted variation from non-biological sources, as batch effects and loss of sensitivity over time may be observed. As normalisation methods for real-time breath analysis have been poorly investigated, we aimed to benchmark known metabolomic data normalisation methods and apply them to PTR-MS data analysis. We compared seven normalisation methods, five statistically based and two using multiple standard metabolites, on two datasets from clinical trials for COVID-19 diagnosis in patients from the emergency department or intensive care unit. We evaluated different means of feature selection to select the standard metabolites, as well as the use of multiple repeat measurements of ambient air to train the normalisation methods. We show that the normalisation tools can correct for time-dependent drift. The methods that provided the best corrections for both cohorts were probabilistic quotient normalisation and normalisation using optimal selection of multiple internal standards. Normalisation also improved the diagnostic performance of the machine learning models, significantly increasing sensitivity, specificity and area under the receiver operating characteristic (ROC) curve for the diagnosis of COVID-19. Our results highlight the importance of adding an appropriate normalisation step during the processing of PTR-MS data, which allows significant improvements in the predictive performance of statistical models.

Clinical trials: VOC-COVID-Diag (EudraCT 2020-A02682-37); RECORDS trial (EudraCT 2020-000296-21).
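To illustrate one of the methods named in the abstract, the following is a minimal sketch of probabilistic quotient normalisation as it is commonly described in the metabolomics literature. It is not the authors' implementation, which the abstract does not detail. The function name `pqn_normalise`, the samples-by-features matrix layout and the choice of the feature-wise median as the default reference spectrum are assumptions made for illustration. An initial total-intensity scaling step, which some descriptions of the method include, is omitted here.

```python
import numpy as np


def pqn_normalise(X, reference=None):
    """Hypothetical helper: probabilistic quotient normalisation (PQN).

    X         : array of shape (n_samples, n_features), non-negative intensities.
    reference : optional reference spectrum of shape (n_features,).
                Assumption: defaults to the feature-wise median across samples.
    Returns the normalised matrix and the per-sample dilution factors.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.median(X, axis=0)
    reference = np.asarray(reference, dtype=float)

    # Only features with a strictly positive reference give a defined quotient.
    valid = reference > 0
    quotients = X[:, valid] / reference[valid]

    # The most probable dilution factor of each sample is the median quotient.
    # Assumption: every factor is non-zero; an all-zero sample would break this.
    factors = np.median(quotients, axis=1)

    return X / factors[:, np.newaxis], factors
```

On a toy matrix in which each sample is the same profile multiplied by a different scaling factor, the estimated factors recover those scalings and all normalised rows coincide. Real breath data would additionally contain noise and genuine biological differences, so this toy case only shows the mechanics of the correction.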

Keywords: PTR-TOF-MS; data normalisation; exhaled breath; machine learning.

MeSH terms

  • Benchmarking
  • Breath Tests / methods
  • COVID-19 Testing
  • COVID-19*
  • Humans
  • Mass Spectrometry / methods
  • Protons
  • Volatile Organic Compounds* / analysis

Substances

  • Protons
  • Volatile Organic Compounds