Bioinformatics and machine learning in gastrointestinal microbiome research and clinical application

Prog Mol Biol Transl Sci. 2020;176:141-178. doi: 10.1016/bs.pmbts.2020.08.011. Epub 2020 Sep 30.


The scientific community currently defines the human microbiome as all the bacteria, archaea, viruses, fungi, and other eukaryotes that occupy the human body. Across the varied locations, compositions, diversity, and abundance of our microbial symbionts, the total number of microorganisms reaches hundreds of trillions. With the advent of next-generation sequencing (NGS), also known as high-throughput sequencing (HTS), the barriers to studying the human microbiome were lowered significantly, making in-depth microbiome research accessible. Certain body sites, such as the gastrointestinal tract, oral cavity, nasal cavity, and skin, have been heavily studied through community-focused projects like the Human Microbiome Project (HMP). In particular, the gastrointestinal microbiome (GM) has received significant attention because of its links to neurological, immunological, and metabolic diseases, as well as cancer. Though HTS technologies allow deeper exploration of the GM, data on the functional characteristics of the microbiota and their resulting effects on human function or disease are still sparse. This gap is compounded by the microbiome variability observed among humans, driven by factors like genetics, environment, diet, metabolic activity, and even exercise, which makes the GM inherently difficult to study. This chapter describes an interdisciplinary approach to GM research with the goal of reducing the obstacles to translating findings into a clinical setting. By combining tools and knowledge from microbiology, metagenomics, bioinformatics, machine learning, and predictive modeling with clinical study data from children with treatment-resistant epilepsy, we describe a proof-of-concept approach to clinical translation and precision application of GM research.

Keywords: Bioinformatics; Diabetes; Epilepsy; Ketogenic diet; Machine learning; Metagenomics; Microbiome; Predictive modeling; Relative abundance; Taxonomic profiling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Gastrointestinal Microbiome*
  • Humans
  • Machine Learning
  • Metagenomics