Sci Rep. 2016 Jun 28;6:28625.
doi: 10.1038/srep28625.

Characterization, Correction and De Novo Assembly of an Oxford Nanopore Genomic Dataset From Agrobacterium Tumefaciens

Free PMC article

Stéphane Deschamps et al. Sci Rep.

Abstract

The MinION is a portable single-molecule DNA sequencing instrument released by Oxford Nanopore Technologies in 2014. It produces long sequencing reads by measuring changes in ionic flow as single-stranded DNA molecules translocate through its pores. Although MinION long reads have a substantially higher error rate than reads from short-read sequencing technologies, they can be used to generate de novo assemblies of microbial genomes after an initial correction step that improves accuracy, either by aligning Illumina sequencing data or by detecting overlaps between Oxford Nanopore reads. In this study, MinION reads were generated from the multi-chromosome genome of Agrobacterium tumefaciens strain LBA4404. Errors in the consensus two-directional (sense and antisense) "2D" sequences were first characterized by comparison with an internal reference assembly. Both Illumina-based correction and self-correction were then performed, and the resulting corrected reads were assembled into high-quality hybrid and non-hybrid assemblies. Corrected read datasets and assemblies were subsequently compared. The results indicate that both hybrid and non-hybrid methods can be used to assemble Oxford Nanopore reads into informative multi-chromosome assemblies, each with slightly different outcomes in terms of contiguity and accuracy.

Conflict of interest statement

S.D. is part of the MinION early access programme (MAP) and has received free sequencing reagents and Flow Cells from Oxford Nanopore Technologies.

Figures

Figure 1. MinION sequence accuracy histogram.
X-axis: percentage identity of the portion of each 2D read aligning to the reference assembly. Y-axis: raw count of MinION 2D reads aligning to the reference assembly, binned by percentage identity. Percentages were determined by aligning raw 2D reads to the reference assembly with BWA-MEM and automatically retrieving the percentage identity from the BWA-MEM output.
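
The caption does not say how percentage identity was retrieved from the BWA-MEM output. One common approach, shown here only as a minimal sketch, derives it from the CIGAR string and the NM (edit distance) tag of a SAM record. The function name `percent_identity` and the choice of alignment columns as the denominator are assumptions, not the authors' method.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar: str, edit_distance: int) -> float:
    """Percentage identity over the aligned portion of a read.

    Assumption: identity = (alignment columns - NM) / alignment columns,
    where alignment columns are M, =, X, I and D operations. Soft and hard
    clips (S, H) are excluded, so only the aligned portion is scored.
    """
    columns = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XID")
    if columns == 0:
        raise ValueError("CIGAR string has no aligned columns")
    return 100.0 * (columns - edit_distance) / columns


# Hypothetical record: 5 clipped bases, 100 alignment columns, NM = 15.
identity = percent_identity("5S90M5I5D", 15)
```

Binning the resulting values by whole percentage points would give the counts plotted on the Y-axis.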
Figure 2. MinION K-mer retrieval.
X-axis: count of each individual 5-mer in the MinION 2D read dataset; Y-axis: count of each individual 5-mer in the reference assembly.
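
The comparison in this figure can be sketched as follows. This is an illustrative example only: the helper names `count_kmers` and `paired_counts` are hypothetical, and skipping k-mers that contain ambiguous bases is an assumption rather than something the caption states.

```python
from collections import Counter


def count_kmers(sequences, k=5):
    """Count every overlapping k-mer across an iterable of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: k-mers containing non-ACGT symbols are skipped.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def paired_counts(read_counts, reference_counts):
    """Return (kmer, count in reads, count in reference) for plotting."""
    kmers = sorted(set(read_counts) | set(reference_counts))
    return [(kmer, read_counts[kmer], reference_counts[kmer]) for kmer in kmers]


# Toy sequences standing in for the 2D reads and the reference assembly.
reads = count_kmers(["ACGTACGTA", "ACGTNACGTA"])
reference = count_kmers(["ACGTACGTA"])
points = paired_counts(reads, reference)
```

Each tuple in `points` corresponds to one point on the scatter plot, one per distinct 5-mer.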
Figure 3. MinION sequence distribution in relation to G+C content.
Sequencing coverage of the reference assembly by individual 2D reads was plotted for all four components of the Agrobacterium tumefaciens strain LBA4404 genome against the G+C content of the same individual 2D reads. G+C content of the 2D reads was determined incrementally using a 1-kb window size.
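
A windowed G+C calculation of this kind might look like the sketch below. The caption does not specify whether the windows overlap, so non-overlapping 1-kb windows with a shorter final window are assumed here, and the function name `gc_content_windows` is hypothetical.

```python
def gc_content_windows(seq, window=1000):
    """Return the G+C percentage of each consecutive window of a sequence.

    Assumption: windows do not overlap, and a final window shorter than
    `window` is still reported, scored over its own length.
    """
    seq = seq.upper()
    percentages = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        percentages.append(100.0 * gc / len(chunk))
    return percentages


# Toy example with a 4-base window instead of 1 kb.
toy = gc_content_windows("GGCCAATTGC", window=4)
```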
Figure 4. Venn diagram comparing MinION 2D read correction processes.
Four of the five corrected datasets were compared, including Illumina-corrected reads generated with PBcR (yellow) and nanocorr (green), and self-corrected reads generated with PBcR (purple) and canu (blue). Numbers indicate the count of corrected 2D reads found at the intersection of two or more datasets, or unique to a particular dataset.
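
The region counts of such a Venn diagram can be computed from sets of read identifiers, as in this minimal sketch. The function name `venn_regions`, the dataset labels, and the read identifiers are illustrative assumptions and do not reflect the study's data.

```python
def venn_regions(datasets):
    """Count reads in each exclusive region of a Venn diagram.

    `datasets` maps a dataset name to a set of read identifiers. The result
    maps each frozenset of dataset names to the number of reads present in
    exactly those datasets and no others.
    """
    all_ids = set().union(*datasets.values())
    regions = {}
    for read_id in all_ids:
        members = frozenset(
            name for name, ids in datasets.items() if read_id in ids
        )
        regions[members] = regions.get(members, 0) + 1
    return regions


# Hypothetical read identifiers for two of the four datasets.
example = venn_regions({
    "PBcR hybrid": {"read1", "read2", "read3"},
    "canu self-corrected": {"read2", "read3", "read4"},
})
```

Keying the result by the exact set of datasets keeps each read in one region only, which matches how the numbers in a Venn diagram are read.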
Figure 5. BLASTN comparisons of MinION assemblies to the reference assembly.
(Top) Contigs from various MinION 2D read assemblies are shown in red; (Bottom) Contigs from the reference assembly are shown in blue (listed in the following order, from left to right: circular chromosome, linear chromosome, At plasmid, Ti plasmid). Ribbon plots are shown where assemblies were aligned and compared to the reference assembly using BLASTN. (a) canu non-hybrid assembly; (b) PBcR non-hybrid assembly; (c) PBcR hybrid assembly; (d) canu hybrid assembly; (e) SPAdes hybrid assembly. Scale is shown in Mbp. The discontinuous homology to the circular chromosome shown in (a) is likely a "wraparound" effect caused by the circular nature of the chromosome.
