Data science for pediatric infectious disease: utilizing COVID-19 as a model

Curr Opin Infect Dis. 2025 Oct 1;38(5):493-498. doi: 10.1097/QCO.0000000000001139. Epub 2025 Aug 1.

Abstract

Purpose of review: During the COVID-19 pandemic, governments and public health agencies used data science tools and data sources in real time to evaluate pathogen transmissibility, disease burden, healthcare capacity, and treatment and preventive measures. This review highlights the application of these data sources and methods during the COVID-19 response.

Recent findings: Advances in the development of common data models enabled multisite data networks to overcome healthcare data fragmentation, supporting national surveillance platforms and offering unprecedented statistical power to detect emerging clinical entities such as MIS-C and long COVID in diverse pediatric populations. These integrated networks were also used to evaluate the effectiveness of vaccines and therapies. New surveillance approaches that combined traditional clinical data with novel data sources, including wastewater detection, web-based search engines, and mobility patterns, yielded comprehensive ensemble approaches that informed public health policy.

Summary: The COVID-19 pandemic highlighted the importance of timely evidence for decision-making during outbreak responses and the benefits of using data science tools to provide real-time, actionable insights, which can help guide the public health response to future infectious disease threats.

Keywords: COVID-19; computable phenotypes; data science; machine learning; pandemic surveillance.

Publication types

  • Review

MeSH terms

  • COVID-19* / epidemiology
  • COVID-19* / prevention & control
  • Child
  • Data Science* / methods
  • Humans
  • Pandemics
  • Pediatrics
  • Public Health
  • SARS-CoV-2