Decoding longitudinal microbiome trajectories: an interpretable machine learning approach for biomarker discovery and prediction

Brief Bioinform. 2025 Jul 2;26(4):bbaf374. doi: 10.1093/bib/bbaf374.

Abstract

Information generated from longitudinally sampled microbial data has the potential to illuminate important aspects of development and progression for many human conditions and diseases. Identifying microbial biomarkers and their time-varying effects can not only advance our understanding of pathogenetic mechanisms, but also facilitate early diagnosis and guide optimal timing of interventions. However, longitudinal predictive modeling of highly noisy and dynamic microbial data (e.g. metagenomics) poses analytical challenges. To overcome these challenges, we introduce a robust and interpretable machine-learning-based longitudinal microbiome analysis framework, LP-Micro, that encompasses (i) longitudinal microbial feature screening via a polynomial group lasso, (ii) disease outcome prediction implemented via machine learning methods (e.g. XGBoost, deep neural networks), and (iii) interpretable association testing between time points, microbial features, and disease outcomes via permutation feature importance. We demonstrate in simulations that LP-Micro not only identifies incident disease-related microbiome taxa, but also offers improved prediction accuracy compared with existing approaches. Applications of LP-Micro in two longitudinal microbiome studies with clinical outcomes of childhood dental disease and weight loss following bariatric surgery yield consistently high prediction accuracy. Moreover, LP-Micro highlights critical time points and associated microbial changes: oral microbial changes, including Streptococcus mutans, are most informative for predicting childhood dental disease at around 39 months of age, while gut microbial changes shortly after bariatric surgery strongly predict future weight loss. These findings are both informative and aligned with clinical expectations. LP-Micro is available at https://github.com/IV012/LPMicro.
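To make step (iii) concrete, the following is a minimal sketch of permutation feature importance computed at the level of whole time points. It is not the LP-Micro implementation: the data are synthetic, the least-squares model is a stand-in for XGBoost or a neural network, and the function names (`fit_linear`, `timepoint_permutation_importance`) are hypothetical. It only shows the general idea of shuffling all taxa at one time point across subjects and measuring how much held-out prediction error increases; it does not perform the association testing described above.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept (assumed stand-in for any predictive model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def timepoint_permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """X has shape (subjects, time points, taxa).

    Returns the mean increase in mean squared error when all taxa
    at a given time point are jointly shuffled across subjects.
    """
    rng = np.random.default_rng(seed)
    n, T, _ = X.shape
    base = np.mean((predict(X.reshape(n, -1)) - y) ** 2)
    importance = np.zeros(T)
    for t in range(T):
        for _ in range(n_repeats):
            Xp = X.copy()
            # Break the link between time point t and the outcome.
            Xp[:, t, :] = X[rng.permutation(n), t, :]
            importance[t] += np.mean((predict(Xp.reshape(n, -1)) - y) ** 2) - base
    return importance / n_repeats

# Synthetic example: 200 subjects, 4 time points, 5 taxa.
# Only time point index 2 carries signal (an assumption of this toy setup).
rng = np.random.default_rng(1)
n, T, p = 200, 4, 5
X = rng.normal(size=(n, T, p))
y = 2.0 * X[:, 2, 0] - 1.5 * X[:, 2, 3] + rng.normal(scale=0.5, size=n)

# Fit on the first 150 subjects, evaluate importance on the held-out 50.
train, test = np.arange(150), np.arange(150, 200)
predict = fit_linear(X[train].reshape(len(train), -1), y[train])
imp = timepoint_permutation_importance(predict, X[test], y[test])
print(np.round(imp, 3))
```

Shuffling a time point as a block, rather than one taxon at a time, keeps the within-visit correlation among taxa intact while still removing that visit's association with the outcome. Computing the importance on held-out subjects avoids rewarding features the model has merely overfit.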

Keywords: biomarker discovery; early disease prediction; interpretable modeling; longitudinal microbiome; machine learning.

MeSH terms

  • Biomarkers*
  • Gastrointestinal Microbiome
  • Humans
  • Longitudinal Studies
  • Machine Learning*
  • Microbiota*

Substances

  • Biomarkers