Cell. 2013 May 9;153(4):919-29.
Diverse Mechanisms of Somatic Structural Variations in Human Cancer Genomes
Erratum in: Cell. 2014 Jun 19;157(7):1736
Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.
Copyright © 2013 Elsevier Inc. All rights reserved.
Figure 1. Example of a complex deletion generated by FoSTeS/MMBIR and a pipeline for predicting SV mechanisms
(A) A complex deletion is predicted by three discordant clusters. The sequence in light blue on the reference is deleted; the sequence in red on the reference is duplicated and inserted into the deletion breakpoints. Three read pairs from the donor are shown above the donor sequence. Three discordant read pairs mapped to the reference are shown above the reference sequence. (B) Reads covering the breakpoints of insertion. The breakpoints are covered by 27 and 11 reads, respectively (only four are shown for each). Reads matching different parts of the reference genome are shown in the corresponding colors. (C) Nucleotide sequences of the reads covering the breakpoints of insertion. Black and red indicate read and reference sequences that match each other; grey indicates unmatched reference sequence. There is a 2 bp microhomology (shown in purple) at the breakpoint on the left and a 9 bp insertion of unknown source (shown in dark green) at the breakpoint on the right. (D) Sequencing depth. Blue and red lines denote the predicted deletion and the predicted insertion donor sites, respectively, showing that the copy number is consistent with the SV call. (E) This flowchart, adapted mainly from Kidd et al. 2010, shows the breakpoint features used to determine the mechanism that is likely to have generated the observed SV. Six types of mechanisms are assigned: transposable element insertion (TEI), variable number of tandem repeats (VNTR), non-homologous end joining (NHEJ), alternative end joining (alt-EJ), non-allelic homologous recombination (NAHR), and fork stalling and template switching/microhomology-mediated break-induced repair (FoSTeS/MMBIR). See also Figure S1 and Tables S1 and S2.
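The kind of decision logic summarized in panel (E) can be sketched roughly as below. This is an illustration only: the legend names the six mechanisms but gives no thresholds, so the cutoffs (`NAHR_MIN_HOMOLOGY`, `MICROHOMOLOGY_MIN`, `LONG_INSERTION_MIN`), the `Breakpoint` fields, and the order of the checks are all assumptions, not the values used by the authors or by Kidd et al. 2010.

```python
from dataclasses import dataclass

# Hypothetical cutoffs, chosen only to make the sketch concrete.
NAHR_MIN_HOMOLOGY = 100   # bp of breakpoint homology taken to suggest NAHR
MICROHOMOLOGY_MIN = 2     # bp of homology taken to suggest alt-EJ
LONG_INSERTION_MIN = 10   # bp of inserted sequence taken to suggest FoSTeS/MMBIR


@dataclass
class Breakpoint:
    """Breakpoint features for one SV (field names are illustrative)."""
    homology_bp: int = 0       # homology shared at the junction
    insertion_bp: int = 0      # inserted sequence at the junction
    in_vntr: bool = False      # SV lies within a tandem-repeat region
    matches_te: bool = False   # SV sequence matches a transposable element
    is_complex: bool = False   # SV combines several rearranged segments


def classify_mechanism(bp: Breakpoint) -> str:
    """Assign one of the six mechanism labels named in the legend."""
    if bp.in_vntr:
        return "VNTR"
    if bp.matches_te:
        return "TEI"
    # Template switching is assumed to explain complex events and long insertions.
    if bp.is_complex or bp.insertion_bp >= LONG_INSERTION_MIN:
        return "FoSTeS/MMBIR"
    if bp.homology_bp >= NAHR_MIN_HOMOLOGY:
        return "NAHR"
    if bp.homology_bp >= MICROHOMOLOGY_MIN:
        return "alt-EJ"
    # Blunt ends, 1 bp homology, or short insertions fall through to NHEJ.
    return "NHEJ"
```

Under these assumed rules, the complex deletion in panels (A) to (D), with a 2 bp microhomology on one side and a 9 bp insertion on the other, would be labeled FoSTeS/MMBIR because it is flagged as complex, not because of either junction feature alone.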
Figure 2. Spectrum of somatic SV types and mechanisms
(A) Frequencies of types of somatic SVs identified in each patient. Each horizontal bar displays the number of SVs for one sample. The colored bar charts on the left show the number of events scaled by the maximum number of events (as noted) in each tumor type. The black bar charts on the right show the number of events for all patients on the same scale. A HapMap genome (NA18507) is shown at the top as an example of germline events; see Figure S2 for germline events for all patients. Most (59%) of the translocations in NA18507 are TE insertions, as described previously (Lee et al., 2012); 18% are repeat-related events, including TE insertions not identified by Lee et al. 2012; and the remaining ones might be events too complex to be identified by Meerkat. (B) Frequencies of somatic deletion mechanisms. The order of the samples is the same as in (A). (C) Frequencies of somatic translocation mechanisms. The order of the samples is the same as in (A). See also Figure S2 and Tables S3, S4, and S5.
Figure 3. Proportion of homologies at the breakpoints of somatic tandem duplications and complex deletions compared with NA18507
Homologies in base pairs are shown for each breakpoint as a positive number. A blunt end has a homology of 0 bp. Small insertions with unknown source are shown as negative numbers. Somatic tandem duplications and complex tandem duplications that are responsible for EGFR and CDK4 amplifications in GBM patients are shown in a separate category. See also Figure S3.
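The signed encoding described in this legend (homology as a positive number, blunt ends as 0, small insertions as negative numbers) can be illustrated with a short sketch. The function names `signed_homology` and `proportions` are hypothetical, and treating homology and insertion as mutually exclusive at a single junction is an assumption made here for simplicity.

```python
from collections import Counter


def signed_homology(homology_bp: int = 0, insertion_bp: int = 0) -> int:
    """Encode one breakpoint junction as a single signed number of base pairs.

    Positive: length of homology. Zero: blunt end. Negative: length of a
    small insertion of unknown source.
    """
    if homology_bp < 0 or insertion_bp < 0:
        raise ValueError("lengths must be non-negative")
    if homology_bp and insertion_bp:
        # Assumption: a junction carries homology or an insertion, not both.
        raise ValueError("junction has both homology and an insertion")
    return homology_bp if homology_bp else -insertion_bp


def proportions(values):
    """Return the proportion of breakpoints at each signed value."""
    counts = Counter(values)
    total = len(values)
    return {value: counts[value] / total for value in sorted(counts)}


# Toy input, not data from the paper: one blunt end, two 2 bp
# microhomologies, and one 9 bp insertion.
junctions = [
    signed_homology(),
    signed_homology(homology_bp=2),
    signed_homology(homology_bp=2),
    signed_homology(insertion_bp=9),
]
print(proportions(junctions))
```

Putting both features on one axis is what lets a single proportion plot show homology-driven and insertion-bearing junctions side by side.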
Figure 4. CDKN2A/B losses in GBM patients
Profiles in the lower part of the plots show copy ratios (tumor vs. matched normal). Above the copy ratio profiles, predicted somatic SVs are represented by lines with the breakpoints indicated by dots. SVs corresponding to a notable copy number change are colored, with the color indicating the orientation of the breakpoints. A red cluster typically suggests a tandem duplication; a blue cluster typically suggests a deletion. The number of supporting discordant read pairs for each SV is shown on the left using the same color-coding. The copy-loss regions are highlighted with blue shades. (A) GBM0208, an arm level loss and a focal deletion. (B) GBM1086, two focal deletions. (C) GBM0648, complex rearrangements. See also Figure S5.
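How breakpoint orientation can hint at the SV type may be sketched as follows. The legend states only that a red cluster typically suggests a tandem duplication and a blue cluster a deletion; the strand convention used here (forward/reverse for deletion-like pairs, reverse/forward for tandem-duplication-like pairs, same-strand for inversion-like pairs) is the common paired-end interpretation and is an assumption, as are the function names and the `"other"` color label.

```python
def suggest_sv_type(left_strand: str, right_strand: str) -> str:
    """Suggest an intrachromosomal SV type from a discordant cluster.

    left_strand and right_strand are "+" or "-" for the reads mapped at the
    lower and higher genomic coordinates, respectively. This assumes a
    standard paired-end library in which concordant pairs map as "+" then "-".
    """
    orientation = left_strand + right_strand
    if orientation == "+-":
        # Same orientation as a concordant pair but spanning too far.
        return "deletion"
    if orientation == "-+":
        # Everted pair: reads point away from each other.
        return "tandem duplication"
    if orientation in ("++", "--"):
        return "inversion"
    raise ValueError(f"unexpected strands: {left_strand!r}, {right_strand!r}")


def cluster_color(sv_type: str) -> str:
    """Map an SV type to the legend's colors; other types get a placeholder."""
    return {"tandem duplication": "red", "deletion": "blue"}.get(sv_type, "other")


print(cluster_color(suggest_sv_type("+", "-")))
print(cluster_color(suggest_sv_type("-", "+")))
```

The word "typically" in the legend matters: orientation alone is only a suggestion, which is why the plots pair each SV with the copy ratio profile beneath it.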
Figure 5. EGFR amplifications in GBM patients
SVs and copy ratios are displayed as described in Figure 4. The copy-loss and gain regions are highlighted with blue and red shades, respectively. (A) GBM0155, three tandem duplications. (B) GBM0145, one tandem duplication and a deletion with insertion at the breakpoints. Two vertical black lines connecting two single events denote a complex deletion, which was predicted by combining two discordant read pair clusters. The solid blue and red lines represent segments that have been deleted and duplicated. The dashed lines denote a region of no copy number change. (C) GBM0214, one tandem duplication and complex rearrangements. See also Figure S6.
Figure 6. Amplifications of EGFR and chromosome 12 in GBM0152
(A) Copy ratio and rearrangements involving EGFR. Colored boxes with arrows denote the amplified regions and their orientations. (B) Diagram of the resulting rearrangements. Three segments of DNA from chromosome 7 and chromosome 12 are merged into one and tandem-duplicated. (C) Copy ratio and somatic rearrangements on chromosome 12. The three grey dashed lines in the copy ratio panel (bottom) denote copy ratios of 40, 75 and 110. The rearrangements marked by “a”, “b”, “c” and “d” have approximately twice as many supporting discordant read pairs as other rearrangements. These rearrangements are also marked in (D), (E) and (F). (D) The 14 Mb region of chromosome 12 shown in (C) was segmented according to copy ratios. Each segment was re-scaled and assigned an identifier from 0 to 40. The rearrangement marked with a black arrow is not involved in the amplifications of other segments on chromosome 12, but is involved in the amplification of EGFR on chromosome 7 as displayed in (A). (E) Each segment in (D) is shown as a numbered node connected by arrows and lines. Black arrows connected by lines denote concordant connections. Ratios of segments are denoted by the number of dots above the segment IDs inside each node. Non-amplified segments are not shown. The connection marked with “e” (also marked in (F)) is a germline deletion. (F) This diagram shows one possible solution for how the segments are connected. Segments with a white background are in an inverted orientation. Colored dashed lines denote discordant connections, while black lines denote concordant connections.