An improved workflow for accurate and robust healthcare environmental surveillance using metagenomics

Microbiome. 2022 Dec 2;10(1):206. doi: 10.1186/s40168-022-01412-x.


Background: Effective surveillance of microbial communities in the healthcare environment is increasingly important in infection prevention. Metagenomics-based techniques are promising due to their untargeted nature but are currently challenged by several limitations: (1) they are not powerful enough to extract valid signals out of the background noise for low-biomass samples, (2) they do not distinguish between viable and nonviable organisms, and (3) they do not reveal the microbial load quantitatively. An additional practical obstacle to a robust pipeline is the inability to efficiently allocate sequencing resources a priori. For most microbiome studies, regardless of sample type, sequencing depth is assessed post hoc, if at all. This practice is inefficient at best, and at worst, poor sequencing depth jeopardizes the interpretation of study results. To address these challenges, we present a workflow for metagenomics-based environmental surveillance that is appropriate for low-biomass samples, distinguishes viability, is quantitative, and estimates sequencing resources.

Results: The workflow was developed using a representative microbiome sample, which was created by aggregating 120 surface swabs collected from a medical intensive care unit. After evaluating and optimizing existing techniques and developing new modules, we recommend best practices and introduce a well-structured workflow. We recommend adopting liquid-liquid extraction to improve DNA yield and incorporating whole-cell filtration only when the nonbacterial proportion is large. For viability assessment, we suggest propidium monoazide treatment coupled with internal standards and absolute abundance profiling, with cultivation added when comprehensive profiling is required. We further recommend integrating internal standards for quantification, supplemented by qPCR when poor taxonomic classification is expected. We also introduce a machine learning-based model to predict required sequencing effort from accessible sample features. The model helps make full use of sequencing resources and achieve desired outcomes.

Conclusions: This workflow will contribute to more accurate and robust environmental surveillance and infection prevention. Lessons gained from this study will also benefit the continuing development of methods in relevant fields.
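
Illustration: The internal-standard quantification and propidium monoazide (PMA) viability assessment recommended in the Results can be sketched as below. This is a minimal illustration, not the authors' implementation. It assumes that a known number of spike-in genome copies is added to each sample before extraction, that read counts scale linearly with genome copies, and that genome-size and extraction biases can be ignored. The function names, taxon labels, and numbers are hypothetical.

```python
def absolute_abundance(taxon_reads, spike_reads, spike_copies_added):
    """Convert read counts to estimated genome copies using a spike-in standard.

    Assumption: reads are proportional to genome copies for every taxon,
    so one scaling factor derived from the spike-in applies to all taxa.
    """
    if spike_reads <= 0:
        raise ValueError("internal standard not detected; cannot scale counts")
    copies_per_read = spike_copies_added / spike_reads
    return {taxon: reads * copies_per_read for taxon, reads in taxon_reads.items()}


def viable_fraction(untreated_copies, pma_treated_copies):
    """Estimate the viable fraction per taxon from paired absolute abundances.

    Assumption: PMA suppresses signal from membrane-compromised cells, so the
    PMA-treated abundance approximates the viable load. Values are capped at 1.0
    because measurement noise can push the ratio above one.
    """
    fractions = {}
    for taxon, total in untreated_copies.items():
        if total > 0:
            viable = pma_treated_copies.get(taxon, 0.0)
            fractions[taxon] = min(viable / total, 1.0)
    return fractions


# Hypothetical example: 1000 spike-in copies added, 100 spike-in reads recovered.
untreated = absolute_abundance({"taxon_A": 200, "taxon_B": 50}, 100, 1000)
treated = {"taxon_A": 500.0}  # taxon_B not detected after PMA treatment
print(untreated)
print(viable_fraction(untreated, treated))
```

Working in absolute rather than relative abundance matters here because PMA treatment changes the total amount of amplifiable DNA, so relative proportions before and after treatment are not directly comparable.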

Keywords: Environmental surveillance; Infection prevention; Low biomass; Machine learning; Metagenomics; Quantification; Sequencing depth prediction; Viability.

Publication types

  • Video-Audio Media
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Delivery of Health Care
  • Environmental Monitoring
  • Metagenomics*
  • Microbiota* / genetics
  • Workflow