Machine learning methods for microbiome studies

J Microbiol. 2020 Mar;58(3):206-216. doi: 10.1007/s12275-020-0066-8. Epub 2020 Feb 27.

Abstract

Research on the microbiome has been actively conducted worldwide, and the results show that the human gut bacterial environment significantly affects the immune system, psychological conditions, cancers, obesity, and metabolic diseases. Thanks to advances in sequencing technology, microbiome studies with large numbers of samples are now feasible at an acceptable cost. Large sample sizes allow more sophisticated modeling with machine learning approaches to study relationships between the microbiome and various traits. This article provides an overview of machine learning methods for non-data scientists interested in the association analysis of microbiomes and host phenotypes. Once the genomic features of the microbiome are determined, various analysis methods can be used to explore the relationship between the microbiome and host phenotypes, including penalized regression, support vector machine (SVM), random forest, and artificial neural network (ANN). Deep neural network methods are also touched on. The analysis procedure, from environment setup to extraction of analysis results, is presented in the Python programming language.
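
As a rough illustration of one of the listed methods, penalized regression, the sketch below fits an L2-penalized (ridge) logistic regression to a small synthetic taxon count table using only the Python standard library. It is not the article's own code: the data are randomly generated, the labeling rule is hypothetical, and the helper names (clr, make_toy_data, fit_ridge_logistic, predict), the centered log-ratio preprocessing, and the values of lam, lr, and n_iter are all assumptions chosen for illustration.

```python
import math
import random


def clr(counts):
    """Centered log-ratio transform of one sample's positive taxon counts."""
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def make_toy_data(n_samples=200, n_taxa=10, seed=0):
    """Synthetic stand-in for a taxon count table; not real microbiome data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        counts = [rng.randint(1, 1000) for _ in range(n_taxa)]
        # Hypothetical rule: phenotype is 1 when taxon 0 outnumbers taxon 1.
        y.append(1 if counts[0] > counts[1] else 0)
        X.append(clr(counts))
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam=0.01, lr=0.5, n_iter=300):
    """Full-batch gradient descent on mean log-loss + (lam / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 penalty shrinks the weights; the intercept is not penalized.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predict class 1 when the linear score is non-negative."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0
            for xi in X]


# Hold out the last 50 synthetic samples to estimate accuracy.
X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
w, b = fit_ridge_logistic(X_train, y_train)
accuracy = sum(p == t for p, t in zip(predict(X_test, w, b), y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice a library implementation would likely replace the hand-written optimizer, and the penalty strength lam would be chosen by cross-validation rather than fixed in advance.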

Keywords: deep learning; machine learning; microbiome; semi-supervised; supervised; unsupervised.

MeSH terms

  • Bacteria* / classification
  • Bacteria* / genetics
  • Genomics / methods
  • Humans
  • Machine Learning*
  • Microbiota / genetics*