A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data

Biostatistics. 2021 Jul 17;22(3):522-540. doi: 10.1093/biostatistics/kxz050.

Abstract

Microbiome omics approaches can reveal intriguing relationships between the human microbiome and certain disease states. Beyond the identification of specific bacterial taxa associated with disease, recent scientific advances provide mounting evidence that metabolism, genetics, and environmental factors can all modulate these microbial effects. However, current methods for integrating microbiome data with other covariates are severely lacking. Hence, we present an integrative Bayesian zero-inflated negative binomial regression model that can both identify taxa that are differentially abundant between distinct phenotypes and quantify covariate-taxa effects. Our model demonstrates good performance on simulated data. Furthermore, we successfully integrated microbiome taxonomies and metabolomics in two real microbiome datasets to provide biologically interpretable findings. In summary, we propose a novel integrative Bayesian regression model that features bacterial differential abundance analysis and quantification of microbiome-covariate effects, which makes it suitable for general microbiome studies.
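
For readers unfamiliar with the zero-inflated negative binomial (ZINB) distribution named above, the following is a minimal sketch of its log-probability for a single taxon count. It is not the authors' implementation: the abstract does not state a parameterization, so the mean/dispersion form (`mu`, `phi`), the zero-inflation probability `pi`, the log link in `mean_from_covariates`, and all function names are assumptions made for illustration.

```python
import math


def nb_log_pmf(y, mu, phi):
    """Log-pmf of a negative binomial with mean mu > 0 and dispersion phi > 0.

    Assumed parameterization: variance = mu + mu**2 / phi.
    """
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )


def zinb_log_pmf(y, mu, phi, pi):
    """Log-pmf of a ZINB: with probability pi (0 < pi < 1) the count is a
    structural zero, otherwise it is drawn from the negative binomial."""
    nb = nb_log_pmf(y, mu, phi)
    if y == 0:
        # A zero can come from either mixture component.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


def mean_from_covariates(intercept, coefs, covariates):
    """Hypothetical log-link regression: log(mu) = intercept + sum(coef * x)."""
    return math.exp(intercept + sum(b * x for b, x in zip(coefs, covariates)))


# Illustrative values only; none of these numbers come from the paper.
mu = mean_from_covariates(intercept=1.0, coefs=[0.5], covariates=[0.2])
log_p_zero = zinb_log_pmf(0, mu, phi=2.0, pi=0.3)
log_p_five = zinb_log_pmf(5, mu, phi=2.0, pi=0.3)
```

In a Bayesian treatment, priors would be placed on the regression coefficients, the dispersion, and the zero-inflation probability, and a log-probability of this kind would serve as the likelihood term; the specific priors and feature-selection mechanism used in the paper are not described in the abstract and are not reproduced here.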

Keywords: Bayesian regression; Count data; Feature selection; Integrative analysis; Microbiome.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Bacteria
  • Bayes Theorem
  • Humans
  • Microbiota*
  • Models, Statistical