Ironing out the wrinkles in the rare biosphere through improved OTU clustering

Susan M Huse; David Mark Welch; Hilary G Morrison; Mitchell L Sogin

doi:10.1111/j.1462-2920.2010.02193.x

Environ Microbiol. 2010 Jul;12(7):1889-98. doi: 10.1111/j.1462-2920.2010.02193.x. Epub 2010 Mar 11.

Authors

Susan M Huse¹, David Mark Welch, Hilary G Morrison, Mitchell L Sogin

Affiliation

¹ Josephine Bay Paul Center, Marine Biological Laboratory at Woods Hole, 7 MBL Street, Woods Hole, MA 02543, USA.

Abstract

Deep sequencing of PCR amplicon libraries facilitates the detection of low-abundance populations in environmental DNA surveys of complex microbial communities. At the same time, deep sequencing can lead to overestimates of microbial diversity through the generation of low-frequency, error-prone reads. Even with sequencing error rates below 0.005 per nucleotide position, the common method of generating operational taxonomic units (OTUs) by multiple sequence alignment and complete-linkage clustering significantly increases the number of predicted OTUs and inflates richness estimates. We show that a 2% single-linkage preclustering methodology followed by an average-linkage clustering based on pairwise alignments more accurately predicts expected OTUs in both single and pooled template preparations of known taxonomic composition. This new clustering method can reduce the OTU richness in environmental samples by as much as 30-60% but does not reduce the fraction of OTUs in long-tailed rank abundance curves that defines the rare biosphere.
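The two-stage procedure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes reads are already aligned to equal length and uses the fraction of mismatched positions as a stand-in for a pairwise-alignment distance. The 2% preclustering threshold comes from the abstract; the 3% average-linkage threshold and all function names (`distance`, `single_linkage_precluster`, `average_linkage`) are hypothetical choices made for illustration.

```python
from itertools import combinations


def distance(a, b):
    """Fraction of mismatched positions (assumption: equal-length, pre-aligned reads)."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length reads")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_precluster(seqs, threshold=0.02):
    """Group reads so that any two reads within `threshold` share a precluster.

    Single linkage is transitive: A joins C if A~B and B~C, even when A and C
    are farther apart than the threshold. Implemented with union-find.
    """
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if distance(seqs[i], seqs[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def average_linkage(seqs, clusters, threshold=0.03):
    """Repeatedly merge the two clusters with the smallest mean pairwise distance.

    Stops when the closest pair is farther apart than `threshold`
    (the 3% default is an assumed OTU width, not taken from the abstract).
    """
    clusters = [list(c) for c in clusters]

    def mean_dist(c1, c2):
        total = sum(distance(seqs[i], seqs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, x, y = min(
            (mean_dist(a, b), x, y)
            for (x, a), (y, b) in combinations(enumerate(clusters), 2)
        )
        if d > threshold:
            break
        clusters[x].extend(clusters[y])  # x < y, so deleting y keeps x valid
        del clusters[y]
    return clusters
```

Calling `average_linkage(seqs, single_linkage_precluster(seqs))` returns lists of read indices, one list per OTU. Preclustering first absorbs reads that differ from a neighbor by only a few positions, which is where low-frequency sequencing errors tend to fall, before the coarser average-linkage step defines OTUs.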

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Biodiversity*
Cluster Analysis*
DNA, Bacterial / chemistry
DNA, Bacterial / genetics
DNA, Ribosomal / chemistry
DNA, Ribosomal / genetics
Environmental Microbiology*
Escherichia coli / genetics
Metagenomics / methods*
RNA, Ribosomal, 16S / genetics
Sequence Alignment / methods*
Sequence Analysis, DNA
Staphylococcus epidermidis / genetics

Substances

DNA, Bacterial
DNA, Ribosomal
RNA, Ribosomal, 16S
