Genome Res, 28 (5), 666-675

Mutational Signatures of DNA Mismatch Repair Deficiency in C. elegans and Human Cancers

Bettina Meier et al. Genome Res.

Abstract

Throughout their lifetime, cells are subject to extrinsic and intrinsic mutational processes leaving behind characteristic signatures in the genome. DNA mismatch repair (MMR) deficiency leads to hypermutation and is found in different cancer types. Although it is possible to associate mutational signatures extracted from human cancers with possible mutational processes, the exact causation is often unknown. Here, we use C. elegans genome sequencing of pms-2 and mlh-1 knockouts to reveal the mutational patterns linked to C. elegans MMR deficiency and their dependency on endogenous replication errors and errors caused by deletion of the polymerase ε subunit pole-4. Signature extraction from 215 human colorectal and 289 gastric adenocarcinomas revealed three MMR-associated signatures, one of which closely resembles the C. elegans MMR spectrum and strongly discriminates microsatellite stable and unstable tumors (AUC = 98%). A characteristic difference between human and C. elegans MMR deficiency is the lack of elevated levels of NCG > NTG mutations in C. elegans, likely caused by the absence of cytosine (CpG) methylation in worms. The other two human MMR signatures may reflect the interaction between MMR deficiency and other mutagenic processes, but their exact cause remains unknown. In summary, combining information from genetically defined models and cancer samples allows for better aligning mutational signatures to causal mutagenic processes.

Figures

Figure 1.
Mutations in C. elegans wild-type and MMR mutants grown for 10 or 20 generations. Identical base substitutions as well as indels occurring in the same genomic location among samples of the entire data set (duplicates) were excluded from the analysis, thus only reporting mutations unique to each individual sample. (A) Number and types of base substitutions identified in the parental (P0) or one first generation (F1) line and three independently propagated F20 lines of wild-type, mlh-1, pms-2, and pole-4 single mutants. (B) Number and types of insertions and deletions (indels) identified in initial (P0 or F1) and three independently propagated F20 lines of wild-type, mlh-1, pms-2, and pole-4 single mutants. (C) Number and types of base substitutions observed in the parental (P0) line and 2–3 independently propagated F10 lines of wild-type, pms-2 and pole-4 single, and pole-4; pms-2 double mutants. (D) Number and type of indels observed in the parental (P0) and 2–3 independently propagated F10 lines of wild-type, pms-2 and pole-4 single, and pole-4; pms-2 double mutants. (E) Average number of base substitutions identified across all individual lines per genotype in their 5′ and 3′ base sequence context in mlh-1 and pms-2 single and in pms-2 single and pole-4; pms-2 double mutants. Error bars represent the standard error of the mean. (F) Examples of indel sequence contexts. Sequence reads aligned to the reference genome WBcel235.74 visualized in Integrative Genomics Viewer (Robinson et al. 2011). A 1-bp (left) and a 2-bp deletion (right) are shown. A subset of sequence reads, which end close to an indel, erroneously aligned across the indel resulting either in wild-type bases (arrow) or base changes (arrowheads). Such wrongly called base substitutions were removed during filtering (Methods) using the deepSNV package (Gerstung et al. 2012, 2014).
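The duplicate-exclusion step described in this legend can be illustrated with a minimal sketch. It assumes each call is a hypothetical `(chromosome, position, ref, alt)` tuple and treats substitutions and indels identically, which simplifies the legend's wording (indels are matched there by genomic location). The sample names and variants are invented for illustration and are not data from the study.

```python
from collections import Counter

def unique_mutations(calls_by_sample):
    """Keep only variants that occur in exactly one sample of the data set."""
    # Count the number of samples in which each variant key is observed.
    seen = Counter(v for calls in calls_by_sample.values() for v in set(calls))
    return {
        sample: sorted(v for v in set(calls) if seen[v] == 1)
        for sample, calls in calls_by_sample.items()
    }

# Hypothetical toy input: one variant is shared by both lines and is dropped.
calls = {
    "pms-2_F20_a": [("I", 1050, "C", "T"), ("II", 77, "A", "AT")],
    "pms-2_F20_b": [("I", 1050, "C", "T"), ("III", 9, "G", "A")],
}
result = unique_mutations(calls)
```

A variant shared between lines is more likely to have been inherited from a common ancestor or to be a systematic artifact than to have arisen independently, which is why only sample-specific calls are reported.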
Figure 2.
Correlation between homopolymer length and the frequency of +1/−1 bp indels. (A) Distribution of homopolymer repeats encoded in the C. elegans genome by length and DNA base shown in log10 scale (left) and the relative percentage of A, C, G, and T homopolymers in the genome (right). (B) Average number of 1-bp indels in homopolymer runs for mlh-1 F20, pms-2 F20, pms-2 F10, and pole-4; pms-2 F10 mutant lines by homopolymer length. (C) Generalized additive spline model (GAM) fit for the ratio of 1-bp indels normalized to the frequency of homopolymers (HPs) in the genome. The average frequency observed across three lines is depicted as a gray dot; gray bars indicate the 95% confidence interval. The red line indicates best fit. Red dotted lines represent the corresponding 95% confidence interval.
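The normalization behind panel C, dividing the number of 1-bp indels by the number of homopolymers of the same length in the genome, can be sketched as follows. The function names and the toy sequence are assumptions made for illustration; the spline (GAM) fit itself is not reproduced here, and runs of length 1 are counted simply because the grouping makes no distinction.

```python
from collections import Counter
from itertools import groupby

def homopolymer_counts(seq):
    """Count runs of identical bases in seq, keyed by (base, run_length)."""
    return Counter((base, len(list(run))) for base, run in groupby(seq.upper()))

def indels_per_homopolymer(indel_counts, hp_counts):
    """Divide observed 1-bp indels per run length by the number of runs of that length."""
    runs_by_length = Counter()
    for (_base, length), n in hp_counts.items():
        runs_by_length[length] += n
    return {
        length: indel_counts[length] / runs_by_length[length]
        for length in indel_counts
        if runs_by_length[length] > 0
    }

# Toy sequence with two runs of length 4 (A and T) and one run of length 2 (G).
hp = homopolymer_counts("AAAACGTTTTGGA")
# Hypothetical indel counts per homopolymer length, not values from the paper.
rates = indels_per_homopolymer({4: 1, 2: 0}, hp)
```

Normalizing by run abundance matters because long homopolymers are rare in the genome, so raw indel counts alone understate how mutable they are.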
Figure 3.
Identification of de novo signatures from human colorectal and gastric adenocarcinoma samples (COAD-US and STAD-US projects) and their contribution to samples clinically classified as microsatellite instable (MSI) or microsatellite stable (MSS). (A) Two-dimensional representation of the mutational profile composition across cancer samples. The size of each circle reflects the mutation burden. MSI samples are highlighted by a bold, black outline. The color of segments reflects the signature composition. (B) Mutational signatures including base substitutions and 1-bp indels derived from the combined COAD-US and STAD-US data sets. (C) Relative contribution of MMR-1, MMR-2, and MMR-3 signatures to cancer samples clinically classified as MSI or MSS. Box plot with outliers shown as individual filled circles. Area under the curve (AUC) value for MMR-1 contribution indicates the probability of a random MSI sample having higher MMR-1 contribution than a random MSS sample. (D) Number of mutations assigned to signatures MMR-1 (green), MMR-2 (purple), and MMR-3 (orange) plotted against the number of 1-bp indels in the same sample. (E) Fold change in the average number of mutations assigned to different signatures in MSI samples compared to MSS samples. As expected, the number of POLE-related mutations is higher in MSS samples as all POLE-deficient tumors are MSS. Apart from MMR signatures, the Clock-1 signature also contributes over 10 times more mutations to MSI samples than to MSS samples. SNP-associated mutations are likely due to unfiltered SNPs that are prevalent in the human population (Supplemental Material).
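The probabilistic reading of the AUC given for panel C can be made concrete with a small sketch: compare every MSI sample with every MSS sample and count the fraction of pairs in which the MSI value is higher. Counting ties as one half is a common convention assumed here, and the contribution values are made-up toy numbers, not data from the study.

```python
def auc_probability(msi_values, mss_values):
    """Estimate P(random MSI value > random MSS value), with ties counting one half."""
    wins = 0.0
    for x in msi_values:
        for y in mss_values:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(msi_values) * len(mss_values))

# Hypothetical relative MMR-1 contributions for a handful of samples.
msi = [0.62, 0.48, 0.71]
mss = [0.05, 0.10, 0.50, 0.02]
auc = auc_probability(msi, mss)  # 11 of the 12 pairs favor the MSI sample
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few hundred tumors; a rank-based computation gives the same value more efficiently.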
Figure 4.
Mutational patterns derived from C. elegans MMR mutants and their comparison to the de novo human signature MMR-1. (A) Base substitution patterns of C. elegans mlh-1, pms-2, and pole-4; pms-2 mutants and their corresponding humanized versions (mirrored). (B) Relative abundance of trinucleotides in the C. elegans genome (red) and the human exome (light blue). (C) MMR-1 base substitution signature compared to pms-2 and mlh-1 mutational patterns adjusted to human whole-exome trinucleotide frequency. Stars indicate the difference in C > T transitions at CpG sites, which occur at lower frequency in C. elegans.
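One plausible way to adjust a C. elegans pattern to human whole-exome trinucleotide frequency is to reweight each context by the ratio of its abundance in the target and source sequence and then renormalize. The sketch below assumes that approach; the legend does not spell out the exact procedure, so the function name, the keying by trinucleotide alone, and all numbers are illustrative assumptions.

```python
def adjust_to_target_context(counts, source_freq, target_freq):
    """Reweight per-context mutation counts from a source to a target composition."""
    # Scale each context by how much more (or less) common it is in the target.
    scaled = {
        ctx: n * target_freq[ctx] / source_freq[ctx]
        for ctx, n in counts.items()
    }
    total = sum(scaled.values())
    # Renormalize so the adjusted pattern sums to 1.
    return {ctx: v / total for ctx, v in scaled.items()}

# Hypothetical toy values: ACG is twice as common in the target, ATA half as common.
worm_counts = {"ACG": 10, "ATA": 10}
worm_freq = {"ACG": 0.01, "ATA": 0.04}
exome_freq = {"ACG": 0.02, "ATA": 0.02}
humanized = adjust_to_target_context(worm_counts, worm_freq, exome_freq)
```

Reweighting of this kind corrects only for differences in sequence composition. It cannot introduce a process that is absent in the source organism, which is consistent with the lower C > T frequency at CpG sites remaining visible after adjustment.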
