Abstract
We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.
Publication types
- Research Support, N.I.H., Extramural
- Research Support, Non-U.S. Gov't
- Research Support, U.S. Gov't, Non-P.H.S.
MeSH terms
- Animals
- Cohort Studies
- Computational Biology / methods*
- DNA, Bacterial / genetics
- False Positive Reactions
- Feces / microbiology
- Female
- High-Throughput Nucleotide Sequencing / methods*
- Humans
- Lactobacillus / classification
- Lactobacillus / genetics
- Lactobacillus / isolation & purification*
- Mice
- Microbiota / genetics*
- Pregnancy
- RNA, Ribosomal, 16S / genetics
- Reproducibility of Results
- Sequence Analysis, DNA / methods*
- Software*
- Vagina / microbiology
Substances
- DNA, Bacterial
- RNA, Ribosomal, 16S