Bacteria comprise the most diverse domain of life on Earth, where they occupy nearly every possible ecological niche and play key roles in biological and chemical processes. Studying the composition and ecology of bacterial ecosystems and understanding their function are of prime importance. High-throughput sequencing technologies enable nearly comprehensive descriptions of bacterial diversity through 16S ribosomal RNA gene amplicons. Analyses of these communities generally rely on taxonomic assignment against reference databases, or on clustering approaches that use de facto sequence similarity thresholds to identify operational taxonomic units. However, these methods often fail to resolve ecologically meaningful differences between closely related organisms in complex microbial datasets. In this paper we describe oligotyping, a novel supervised computational method that allows researchers to investigate the diversity of closely related but distinct bacterial organisms within the final operational taxonomic units that canonical approaches identify in environmental datasets from 16S ribosomal RNA gene data. Our analysis of two datasets from two distinct environments demonstrates the capacity of oligotyping to discriminate distinct microbial populations of ecological importance. Oligotyping can resolve the distribution of closely related organisms across environments and unveil previously overlooked ecological patterns in microbial communities. An open-source software pipeline for oligotyping is available at http://oligotyping.org.
Keywords: 16S; OTU clustering; Shannon entropy; bacterial taxonomy; microbial diversity.