Nucleic Acids Res. 2014 Jan;42(2):e8.
doi: 10.1093/nar/gkt865. Epub 2013 Sep 25.

Long Insert Whole Genome Sequencing for Copy Number Variant and Translocation Detection

Free PMC article

Winnie S Liang et al. Nucleic Acids Res.

Abstract

As next-generation sequencing expands its presence in the clinic, a cost-effective and robust strategy is needed for identifying copy number changes and translocations in tumor genomes. We hypothesized that shallow whole genome sequencing (WGS) of 900-1000-bp inserts (long insert WGS, LI-WGS) improves our ability to detect these events compared with shallow WGS of 300-400-bp inserts. A priori analyses show that LI-WGS requires less sequencing than short insert WGS to achieve a target physical coverage, and that LI-WGS requires less sequence coverage to detect a heterozygous event with a power of 0.99. We therefore developed an LI-WGS library preparation protocol based on Illumina's WGS library preparation protocol and illustrate the feasibility of performing LI-WGS. We additionally applied LI-WGS to three separate tumor/normal DNA pairs collected from patients diagnosed with different cancers to demonstrate its use on actual patient samples for identifying somatic copy number alterations and translocations. With the evolution of sequencing technologies and bioinformatics analyses, we show that modifications to current approaches may improve our ability to interrogate cancer genomes.
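The relationship between sequence coverage and physical coverage can be illustrated with simple arithmetic. The following is a minimal Python sketch, not the authors' analysis: it assumes paired-end reads, takes the insert size to be the full span of a read pair (both reads included), and uses hypothetical function names chosen for illustration.

```python
def physical_coverage(sequence_coverage, insert_size, read_length):
    """Physical (span) coverage implied by a given sequence coverage.

    Assumption: each read pair contributes 2 * read_length sequenced bases
    but spans insert_size bases of the genome.
    """
    return sequence_coverage * insert_size / (2 * read_length)


def sequence_coverage_needed(target_physical, insert_size, read_length):
    """Sequence coverage required to reach a target physical coverage."""
    return target_physical * (2 * read_length) / insert_size


if __name__ == "__main__":
    # Same sequence coverage (2x), 2 x 100 reads, two insert sizes.
    print(physical_coverage(2, 300, 100))  # short insert
    print(physical_coverage(2, 900, 100))  # long insert
    # Sequence coverage needed for 30x physical coverage.
    print(sequence_coverage_needed(30, 300, 100))
    print(sequence_coverage_needed(30, 900, 100))
```

Under these assumptions, a 900-bp insert yields three times the physical coverage of a 300-bp insert at the same sequence coverage, which is the sense in which LI-WGS needs less sequencing to reach a given physical coverage.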

Figures

Figure 1.
Comparison of short insert (SI) and long insert (LI) WGS. A visualization of mapped reads for SI- and LI-WGS is shown assuming a read depth of 2 for each library type. The reference human genome is shown in the middle of the figure, and the location of a theoretical breakpoint is marked by the gray line. SI (300 bp) mapped reads are displayed above the reference, and LI (900 bp) mapped reads are displayed below the reference. Paired-end (PE) reads are represented by heavy solid lines with arrowheads, and regions between reads are denoted by a dotted line. Anomalous read pairs are shown in red. At the same read depth, LI-WGS libraries achieve higher physical coverage than SI-WGS libraries. Furthermore, because each LI read pair interrogates a larger genomic region, the likelihood that a breakpoint will fall within that region is increased.
Figure 2.
Comparison of power achieved when sequencing LI or SI libraries. Power calculations were performed to evaluate the power achieved when sequencing SI (300 bp) libraries with a 2 × 100 read length (A). These analyses were performed to determine the power of identifying a heterozygous somatic event as characterized by at least 10 anomalous read pairs under three scenarios where a tumor sample may have three different tumor cellularities (100, 50, 25% tumor). This analysis was similarly performed for LI (900 bp) libraries with a 2 × 100 read length (B). We performed additional LI analyses using the same parameters but decreased the read length from 2 × 100 to 2 × 83 (C). For all three analyses, a dotted line demarcates the sequence coverage needed for detecting a heterozygous event in a sample with 50% tumor cellularity and 0.99 power. Coverage shown is sequence coverage, and a is the expected frequency of an event given the different tumor cellularities.
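A power calculation of this general kind can be sketched with a Poisson model. The Python below is an illustrative simplification and not the authors' method: it assumes the number of anomalous read pairs spanning a breakpoint is Poisson distributed, that a pair counts only when the breakpoint falls in the unsequenced gap between its two reads, and that the event frequency for a heterozygous event in a diploid genome is half the tumor cellularity. All function and parameter names are hypothetical.

```python
import math


def poisson_tail(lam, k_min):
    """Return P(X >= k_min) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_min))
    return 1.0 - cdf


def detection_power(seq_cov, insert_size, read_length, cellularity, min_pairs=10):
    """Power to observe at least min_pairs anomalous read pairs at a breakpoint.

    Assumptions (not taken from the paper):
      - event frequency = cellularity / 2 (heterozygous, diploid);
      - a pair is informative only if the breakpoint lies in the gap
        between its two reads, i.e. within insert_size - 2 * read_length bp.
    """
    event_freq = cellularity / 2.0
    gap = max(insert_size - 2 * read_length, 0)
    pairs_per_base = seq_cov / (2.0 * read_length)  # read-pair start density
    expected_pairs = event_freq * pairs_per_base * gap
    return poisson_tail(expected_pairs, min_pairs)


if __name__ == "__main__":
    for cellularity in (1.0, 0.5, 0.25):
        si = detection_power(10, 300, 100, cellularity)
        li = detection_power(10, 900, 100, cellularity)
        print(f"cellularity={cellularity:.2f}  SI power={si:.4f}  LI power={li:.4f}")
```

Under this simplified model, the expected number of anomalous pairs grows with the gap between reads, so a long insert library reaches a given power at lower sequence coverage than a short insert library with the same read length.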
Figure 3.
LI library preparation quality control. Two examples of fragmented human genomic samples to a target of 900 bp are shown (A). Fragmented samples are run alongside Invitrogen’s 1 Kb Plus DNA ladder. An example of ligation products for the LI-WGS preparation protocol is shown in (B). Products are run alongside the same 1 Kb Plus ladder shown in (C). The same gel from (B) following size selection is shown in (C) in which multiple collections of ligation product were obtained. An example Bioanalyzer trace of a final LI-WGS library is shown in (D; FU = fluorescence units). The library peak is demarcated by an arrow; flanking peaks are Bioanalyzer marker peaks.
Figure 4.
Comparison of cluster sizes between SI and LI libraries. An example image from sequencing an SI library is shown in (A), along with a cluster density plot from Illumina’s Sequence Analysis Viewer. An example image and cluster density plot from sequencing an LI library are shown in (B). In each cluster density plot, the blue boxes represent total densities and the green boxes represent PF cluster densities. Red lines demarcate the median for the total density and the PF density.


