Microbial genes outperform species and SNVs as diagnostic markers for Crohn's disease on multicohort fecal metagenomes empowered by artificial intelligence

Gut Microbes. 2023 Jan-Dec;15(1):2221428. doi: 10.1080/19490976.2023.2221428.


Dysbiosis of the gut microbial community is associated with the pathogenesis of Crohn's disease (CD) and may serve as a promising noninvasive diagnostic tool. We aimed to compare the performance of microbial markers at different biological levels by conducting a multidimensional analysis of the microbial metagenomes of CD. We collected fecal metagenomic datasets generated from eight cohorts that together include 870 CD patients and 548 healthy controls. Microbial alterations in CD patients were assessed at the species, gene, and single-nucleotide variant (SNV) levels, and diagnostic models were then constructed using an artificial intelligence algorithm. A total of 227 species, 1047 microbial genes, and 21,877 microbial SNVs differed between CD patients and controls. The species, gene, and SNV models achieved average areas under the curve (AUCs) of 0.97, 0.95, and 0.77, respectively. Notably, the gene model exhibited superior diagnostic capability, achieving average AUCs of 0.89 and 0.91 for internal and external validations, respectively. Moreover, the gene model was specific for CD against other microbiome-related diseases. Furthermore, we found that the phosphotransferase system (PTS) contributed substantially to the diagnostic capability of the gene model. The outstanding performance of PTS was mainly explained by the genes celB and manY, which showed high predictive power for CD in the metagenomic datasets and were validated in an independent cohort by qRT-PCR analysis. Our global metagenomic analysis unravels the multidimensional alterations of the microbial communities in CD and identifies microbial genes as robust diagnostic biomarkers across geographically and culturally distinct cohorts.
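
The AUC values reported above summarize how well a model's scores separate CD patients from controls. As a minimal sketch of that metric only (not the authors' pipeline), the AUC can be computed as the fraction of case-control pairs in which the case receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    labels: 1 for a case (e.g. CD), 0 for a control.
    scores: model output per sample; higher means "more likely a case".
    Tied case-control pairs contribute 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three cases and three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of the 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. In an external validation, the scores would come from a model applied to a cohort that was not used to train it.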

Keywords: Crohn’s disease; artificial intelligence; microbiome biomarkers; noninvasive diagnosis; phosphotransferase system.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Crohn Disease* / diagnosis
  • Crohn Disease* / genetics
  • Dysbiosis / diagnosis
  • Dysbiosis / genetics
  • Feces
  • Gastrointestinal Microbiome* / genetics
  • Genes, Microbial
  • Humans
  • Metagenome

Grants and funding

The work was supported by the National Natural Science Foundation of China [82170542, 92251307, 32200529, 82000536]; the National Key Research and Development Program of China [2021YFF0703700/2021YFF0703702]; the Guangdong Province “Pearl River Talent Plan” Innovation and Entrepreneurship Team Project [2019ZT08Y464]; the Program of Guangdong Provincial Clinical Research Center for Digestive Diseases [2020B1111170004]; and the National Key Clinical Discipline.