Examining chromatin heterogeneity through PacBio long-read sequencing of M.EcoGII methylated genomes: an m6A detection efficiency and calling bias correcting pipeline

bioRxiv [Preprint]. 2023 Nov 28:2023.11.28.569045. doi: 10.1101/2023.11.28.569045.

Abstract

Recent studies have combined DNA methyltransferase footprinting of genomic DNA in nuclei with long-read sequencing, resulting in detailed chromatin maps for multi-kilobase stretches of genomic DNA from one cell. Theoretically, nucleosome footprints and nucleosome-depleted regions can be identified using M.EcoGII, which methylates adenines in any sequence context, providing a high-resolution map of accessible regions in each DNA molecule. Here we report PacBio long-read sequence data for budding yeast nuclei treated with M.EcoGII and a bioinformatic pipeline which corrects for three key challenges undermining this promising method. First, detection of m6A in individual DNA molecules by the PacBio software is inefficient, resulting in false footprints predicted by random gaps of seemingly unmethylated adenines. Second, there is a strong bias against m6A base calling as AT content increases. Third, occasional methylation occurs within nucleosomes, breaking up their footprints. After correcting for these issues, our pipeline calculates a correlation coefficient-based score indicating the extent of chromatin heterogeneity within the cell population for every gene. Although the population average is consistent with that derived using other techniques, we observe a wide range of heterogeneity in nucleosome positions at the single-molecule level, probably reflecting cellular chromatin dynamics.

Publication types

  • Preprint

Grants and funding

This research was supported by the Intramural Research Program of the NIH (NICHD).