Correlation detection strategies in microbial data sets vary widely in sensitivity and precision

ISME J. 2016 Jul;10(7):1669-81. doi: 10.1038/ismej.2015.235. Epub 2016 Feb 23.

Abstract

Disruption of healthy microbial communities has been linked to numerous diseases, yet microbial interactions remain poorly understood. This is due in part to the large number of bacteria and the much larger number of possible interactions (easily in the millions), which makes experimental investigation very difficult at best and has motivated the nascent field of computational exploration through microbial correlation networks. We benchmark the performance of eight correlation techniques on simulated and real data in response to challenges specific to microbiome studies: fractional sampling of ribosomal RNA sequences, uneven sampling depths, rare microbes and a high proportion of zero counts. We also test each technique's ability to distinguish signal from noise and to detect a range of ecological and time-series relationships. Finally, we provide specific recommendations for correlation technique usage. Although some methods perform better than others, current techniques still need considerable improvement.
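
One of the challenges named in the abstract, fractional sampling, can be illustrated with a toy sketch. It is not the benchmark used in the study. It assumes hypothetical, independently drawn counts for two taxa (the `simulate_counts` helper and its count range are invented for illustration) and shows that converting counts to fractions of each sample's total forces the two fractions to sum to 1, so their Pearson correlation is driven to -1 even though the underlying counts were generated independently.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def to_relative(sample):
    """Convert one sample's counts to fractions of that sample's total."""
    total = sum(sample)
    return [count / total for count in sample]


def simulate_counts(n_samples, n_taxa, seed=0):
    """Hypothetical toy data: independent integer counts for each taxon."""
    rng = random.Random(seed)
    return [[rng.randint(50, 500) for _ in range(n_taxa)]
            for _ in range(n_samples)]


def column(table, j):
    """Extract column j (one taxon across all samples) from a row-major table."""
    return [row[j] for row in table]


# Two taxa whose absolute counts are generated independently of each other.
counts = simulate_counts(n_samples=200, n_taxa=2)
fractions = [to_relative(sample) for sample in counts]

r_counts = pearson(column(counts, 0), column(counts, 1))
r_fractions = pearson(column(fractions, 0), column(fractions, 1))

print(f"correlation of raw counts:        {r_counts:+.3f}")
print(f"correlation of relative fractions: {r_fractions:+.3f}")
```

With only two taxa the effect is extreme because one fraction is exactly one minus the other; with more taxa the induced dependence is weaker but does not vanish, which is one reason correlation techniques can behave differently on fractional data than on absolute abundances.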

MeSH terms

  • Bacteria / genetics*
  • Benchmarking / statistics & numerical data*
  • Computational Biology
  • Humans
  • Microbial Interactions*
  • Microbiota*
  • Models, Statistical
  • RNA, Ribosomal, 16S / genetics
  • Statistics as Topic

Substances

  • RNA, Ribosomal, 16S