, 32 (8), 829-33

Decoding Long Nanopore Sequencing Reads of Natural DNA

Andrew H Laszlo et al. Nat Biotechnol.

Abstract

Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.
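The quadromer lookup described in the abstract can be illustrated with a minimal sketch. It assumes that each current level is predicted from one four-nucleotide window and that the window advances by one nucleotide per level. The names `all_quadromers`, `predict_levels`, and `toy_map` are hypothetical, and the values in `toy_map` are placeholders, not measured currents.

```python
from itertools import product


def all_quadromers():
    """Return every four-nucleotide combination over A, C, G, T (4**4 = 256)."""
    return ["".join(p) for p in product("ACGT", repeat=4)]


def predict_levels(sequence, quadromer_map):
    """Look up one predicted level per 4-nt window, sliding one base at a time."""
    return [quadromer_map[sequence[i:i + 4]] for i in range(len(sequence) - 3)]


# Placeholder map: the index of each quadromer stands in for a measured current.
toy_map = {q: float(i) for i, q in enumerate(all_quadromers())}

# A 6-nt sequence yields 3 overlapping quadromers: ACGT, CGTA, GTAC.
levels = predict_levels("ACGTAC", toy_map)
```

A sequence of N nucleotides yields N − 3 predicted levels under this assumption, since the first full window needs four bases.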

Figures

Figure 1
Experimental schematic and raw data. (a) Method of adapting dsDNA for nanopore sequencing. The first adaptor (orange) includes a cholesterol tail that inserts into the membrane, increasing DNA capture rates, while the long 5' single-stranded overhang facilitates insertion into the pore. A second adaptor (green) enables re-reading of the pore using the DNAP's synthesis mode. (b) The protein nanopore MspA is shown in blue, phi29 DNAP in green and DNA in orange. An applied voltage across the bilayer drives an ion current through the pore and an amplifier measures the current. DNA bases within the constriction determine the ion current. The phi29 DNAP steps DNA through the pore in single-nucleotide steps. (c–e) Raw data for a representative 3000-second time window. Ion current changes as DNA is fed through the pore in single-nucleotide steps. Panels d and e each show a 1% section of the preceding panel's data shaded in red.
Figure 2
A quadromer map predicts current levels for previously unmeasured DNA. (a) Current levels observed for all possible 4-nucleotide sequences (quadromers) measured in eight segments of a 256-nucleotide de Bruijn sequence. (b) The black trace shows a consensus based on 22 reads of phi X 174 DNA. This is compared to predicted current levels based on the de Bruijn quadromer values. Error bars are the variance of the measured quadromer values. We use a consensus to correct for insertion/deletion errors caused by the stochastic motion of the phi29 DNAP. (c) Absolute current difference between quadromer map and measured consensus for the ~100 level sequence shown in panel b using the de Bruijn quadromer map (blue) and the revised quadromer map (red). In most instances, the revised map improves the predictive ability of our map. The correlation coefficient between measured values and the de Bruijn quadromer values is 0.9905 (95% confidence bounds [0.9859–0.9936]). The correlation coefficient between measured values and the revised quadromer values is 0.9938 (95% confidence bounds [0.9908–0.9958]).
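A 256-nucleotide de Bruijn sequence contains every quadromer exactly once when read cyclically, which is why it is a compact template for measuring all 256 of them. The sketch below uses the standard recursive Lyndon-word construction. It is an illustration only: the function name `de_bruijn` is hypothetical, and the sequence it produces is not claimed to be the one the authors used or how they split it into eight segments.

```python
def de_bruijn(alphabet, n):
    """Build a cyclic de Bruijn sequence of order n over the given alphabet."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t, p):
        if t > n:
            # Emit the prefix only when its period p divides n (Lyndon word rule).
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


seq = de_bruijn("ACGT", 4)

# Read cyclically: append the first 3 bases so the last windows wrap around.
wrapped = seq + seq[:3]
windows = {wrapped[i:i + 4] for i in range(len(seq))}
```

Checking that `windows` holds 256 distinct quadromers confirms the defining property for this particular sequence.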
Figure 3
Raw data to alignment. (a) Raw data are processed using a level-finding algorithm (Supplementary Discussion) to identify transitions between levels in the current trace. A subsequent filter removes most repeated levels, which likely result from polymerase backsteps (indicated by `*'). (b) Extract the sequence of median current values of each level. (c) Align the current values to predicted values from the reference sequence using the quadromer map (Fig. 2a). Alignment is performed with a dynamic programming alignment algorithm similar to Needleman-Wunsch alignment (Supplementary Discussion). In some locations, levels are skipped in the nanopore read owing either to motions of the DNAP or to errors made by the level-finding algorithm. In other places, backsteps result in multiple reads of the same level. We determine read boundaries from the first and last matched levels in the reference sequence. Read boundaries are indicated by the blue lines. The above alignment had an estimated 6.4 × 10⁻¹⁵ probability of false alignment.
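The alignment step in panel c can be sketched as a Needleman-Wunsch-style dynamic program over current levels rather than letters. This is a simplified global variant under stated assumptions, not the authors' algorithm (which is described in their Supplementary Discussion): the match cost is the absolute current difference, and a single hypothetical `gap` penalty covers both skipped reference levels and extra measured levels. The function name `align_levels` and all numeric values are illustrative.

```python
def align_levels(measured, predicted, gap=5.0):
    """Globally align measured levels to predicted levels; return (cost, pairs)."""
    n, m = len(measured), len(predicted)
    # cost[i][j]: minimal cost of aligning measured[:i] with predicted[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = abs(measured[i - 1] - predicted[j - 1])
            cost[i][j], back[i][j] = min(
                (cost[i - 1][j - 1] + diff, "diag"),  # levels matched
                (cost[i - 1][j] + gap, "up"),         # extra measured level
                (cost[i][j - 1] + gap, "left"),       # predicted level skipped
            )
    # Trace back from the corner to recover matched index pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Toy data: the second predicted level (60) is missing from the measured read.
predicted = [50.0, 60.0, 40.0, 70.0, 55.0]
measured = [50.5, 40.2, 70.1, 55.3]
total_cost, pairs = align_levels(measured, predicted)
```

In `pairs`, an entry of the form `(None, j)` marks a predicted level with no measured counterpart (a skip), and `(i, None)` marks an extra measured level such as an unfiltered backstep.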
Figure 4
Alignments to reference sequence and hybrid reconstruction. (a) Coverage plot for 91 nanopore sequencing reads of bacteriophage phi X 174 genomic DNA. Left and right alignment bounds are indicated by the extent of the line for each read. Random attachment of the asymmetric adaptors results in reads of both sense and antisense strands. Reads below the black dashed line (events 1–38) are sense strands while reads above (events 40–92) are antisense strands. Most reads begin near the 5' end of the linearization cut site and proceed towards the 3' end as the phi29 DNAP unzips the double-stranded DNA. (b) Sum total coverage for each region within the phi X 174 genome. This graph indicates the number of reads that cover any given section of the genome using the sense and antisense strands. (c) Hybrid assembly of Illumina sequencing reads using a single nanopore read (Supplementary Discussion). Thirty-eight Illumina reads (horizontal black lines) are aligned to a single 3,819-nt-long nanopore read (blue trace; indicated by the red * in panel a). (d) Detail of shaded region in panel c. Six 100-bp Illumina reads are shown where they align to the nanopore read.
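The summed coverage in panel b can be computed from the per-read alignment bounds of panel a. The following is a minimal sketch, assuming each read is given as a half-open `(start, end)` interval in reference coordinates; the function name `coverage`, the toy intervals, and the toy genome length are hypothetical and do not come from the paper's data.

```python
def coverage(read_bounds, genome_length):
    """Return per-position read depth from half-open (start, end) intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (genome_length + 1)
    for start, end in read_bounds:
        delta[start] += 1
        delta[end] -= 1
    depth, running = [], 0
    for change in delta[:genome_length]:
        running += change
        depth.append(running)
    return depth


# Three toy reads on a 10-position reference.
depth = coverage([(0, 5), (3, 8), (4, 6)], 10)
```

The difference-array form runs in time proportional to the number of reads plus the genome length, instead of incrementing every position under every read.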
