Father-to-offspring transmission of extremely long NOTCH2NLC repeat expansions with contractions: genetic and epigenetic profiling with long-read sequencing

Clin Epigenetics. 2021 Nov 13;13(1):204. doi: 10.1186/s13148-021-01192-5.


Background: GGC repeat expansions in NOTCH2NLC are associated with neuronal intranuclear inclusion disease. Very recently, asymptomatic carriers with NOTCH2NLC repeat expansions were reported. In these asymptomatic individuals, the CpG island in NOTCH2NLC is hypermethylated, suggesting that two factors, repeat length and DNA methylation status, should be considered when evaluating pathogenicity. Long-read sequencing can be used to simultaneously profile genomic and epigenomic alterations. We analyzed four sporadic cases with NOTCH2NLC repeat expansion and their phenotypically normal parents. Native genomic DNA, which retains base modifications, was sequenced on a per-trio basis using both PacBio and Oxford Nanopore long-read sequencing technologies. A custom workflow was developed to evaluate DNA modifications. With these two technologies combined, long-range DNA methylation information was integrated with complete repeat DNA sequences to investigate the genetic origins of expanded GGC repeats in these sporadic cases.

Results: In all four families, asymptomatic fathers had longer expansions (median: 522, 390, 528 and 650 repeats) than their affected offspring (median: 93, 117, 162 and 140 repeats, respectively). The paternal expansions are much longer than the disease-causing range previously reported (in general, 41-300 repeats). Repeat lengths were extremely variable in the fathers, suggesting somatic mosaicism. Instability is more frequent in alleles with uninterrupted pure GGCs. Single molecule epigenetic analysis revealed complex DNA methylation patterns and epigenetic heterogeneity. We identified an aberrant gain-of-methylation region (2.2 kb in size, extending beyond the CpG island and GGC repeats) in asymptomatic fathers. In the normal allele, this region was unmethylated and flanked on both sides by transitional zones containing both methylated and unmethylated CpG dinucleotides; it may be protected from methylation to ensure NOTCH2NLC expression.
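
The size of the paternal contractions can be worked out directly from the medians reported above. The following is a minimal sketch, not part of the authors' workflow: the family labels, the helper name `contraction` and the pairing of values by reporting order are assumptions made for illustration, and the 41-300 range is the general disease-causing range quoted in the text.

```python
# Median repeat counts (father, offspring) as reported in the Results.
# The labels "family 1".."family 4" are hypothetical; they follow reporting order.
medians = {
    "family 1": (522, 93),
    "family 2": (390, 117),
    "family 3": (528, 162),
    "family 4": (650, 140),
}

# Disease-causing range quoted in the abstract (in general, 41-300 repeats).
DISEASE_RANGE = (41, 300)


def contraction(father, offspring):
    """Return repeats lost between the two medians and the fraction of the paternal median."""
    lost = father - offspring
    return lost, lost / father


def in_disease_range(repeats):
    """True if a repeat count falls inside the quoted disease-causing range."""
    low, high = DISEASE_RANGE
    return low <= repeats <= high


summary = {fam: contraction(f, o) for fam, (f, o) in medians.items()}

for fam, (father, offspring) in medians.items():
    lost, frac = summary[fam]
    print(
        f"{fam}: father {father} (in range: {in_disease_range(father)}), "
        f"offspring {offspring} (in range: {in_disease_range(offspring)}), "
        f"contraction {lost} repeats ({frac:.0%} of paternal median)"
    )
```

On these medians, every father lies above the quoted range and every affected child lies within it, with median differences of 429, 273, 366 and 510 repeats.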

Conclusions: We clearly demonstrate that the four sporadic NOTCH2NLC-related cases arose from paternal GGC repeat contraction associated with demethylation. The entire genetic and epigenetic landscape of the NOTCH2NLC region was uncovered using the custom workflow for long-read sequence data, demonstrating the utility of this method for revealing epigenetic and mutational changes in repetitive elements, which are difficult to characterize by conventional short-read or bisulfite sequencing methods. Our approach should be useful for biomedical research, aiding the discovery of DNA methylation abnormalities throughout the genome.

Keywords: DNA methylation; Long-read sequencing; NOTCH2NLC; Neuronal intranuclear inclusion disease; Repeat expansion; Single molecule epigenetic analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Methylation / genetics
  • DNA Methylation / physiology
  • Epigenesis, Genetic / genetics
  • Epigenesis, Genetic / physiology
  • Father-Child Relations*
  • Genetic Background*
  • High-Throughput Nucleotide Sequencing / methods
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Intercellular Signaling Peptides and Proteins / analysis
  • Intercellular Signaling Peptides and Proteins / genetics*
  • Nerve Tissue Proteins / analysis
  • Nerve Tissue Proteins / genetics*


Substances

  • Intercellular Signaling Peptides and Proteins
  • NOTCH2NLC protein, human
  • Nerve Tissue Proteins