PLoS One. 2013 Jun 17;8(6):e66643.
doi: 10.1371/journal.pone.0066643. Print 2013.

A filtering method to generate high quality short reads using illumina paired-end technology

A Murat Eren et al. PLoS One. 2013.

Erratum in

  • PLoS One. 2013;8(6). doi:10.1371/annotation/afa5c40d-c604-46ae-84c4-82cb92193a5e

Abstract

Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, lack of consensus between very similar sequences in metagenomic studies can, and often does, represent natural variation of biological significance. The machine-assigned quality scores commonly used on next-generation platforms do not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language, with user instructions, can be obtained from https://github.com/meren/illumina-utils.
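
The general idea of using the overlap between paired reads as an error check can be illustrated with a minimal Python sketch. This is not the illumina-utils implementation: the function names, the assumption that the overlap length is known in advance, and the `max_mismatches` threshold are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; not the code from illumina-utils.
# Assumption: read 2 is sequenced from the opposite strand, and the
# last `overlap_len` bases of read 1 cover the same positions as the
# first `overlap_len` bases of the reverse-complemented read 2.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_mismatches(read1, read2, overlap_len):
    """Count disagreements between the two reads in the overlap."""
    if overlap_len <= 0 or overlap_len > min(len(read1), len(read2)):
        raise ValueError("overlap_len must fit inside both reads")
    tail = read1[-overlap_len:]
    head = reverse_complement(read2)[:overlap_len]
    return sum(a != b for a, b in zip(tail, head))


def passes_overlap_filter(read1, read2, overlap_len, max_mismatches=0):
    """Keep a pair only if the overlap disagrees at most max_mismatches times.

    max_mismatches is a hypothetical parameter for this sketch.
    """
    return overlap_mismatches(read1, read2, overlap_len) <= max_mismatches
```

A pair whose two reads agree at every overlapping position passes the filter; a pair with a disagreement in the overlap is flagged as error-prone regardless of its reported quality scores.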

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Structure of V6 fusion primers used to generate amplicon libraries for Illumina sequencing.
Figure 2. Comparison of three filtering methods.
The top panel shows the ratio of pairs identified as low quality versus all pairs analyzed for each method. The total number of pairs in each dataset is shown in the bottom panel.
Figure 3. Paired-end reads from 33 samples that passed and failed the quality filtering by individual methods are compared in Venn diagrams.
The mean quality scores of paired-end reads from the numbered regions in Venn diagrams are shown below. In each panel, the top and bottom lines show read 1 and read 2, respectively. The mean quality of each pair at each nucleotide position is also shown with a smooth line.
Figure 4. Comparison of the relative cluster counts and the absolute number of clusters identified in reads filtered by three different methods.
The top panel shows the relative number of 97% OTUs for each method, with the method that produced the largest number assigned a value of 1.0. The bottom panel presents the actual number of OTUs for each method.
