Accurate cross-species 5mC detection for Oxford Nanopore sequencing in plants with DeepPlant

Nat Commun. 2025 Apr 4;16(1):3227. doi: 10.1038/s41467-025-58576-x.

Abstract

Nanopore sequencing enables comprehensive detection of 5-methylcytosine (5mC), particularly in repeat regions. However, CHH methylation detection in plants is limited by the scarcity of high-methylation positive samples, which reduces generalization across species. Dorado, the only tool for plant 5mC detection on the R10.4 platform, lacks extensive species testing. Here, we develop DeepPlant, a deep learning model incorporating both Bi-LSTM and Transformer architectures, which significantly improves CHH detection accuracy and performs well for CpG and CHG motifs. We address the scarcity of methylation-positive CHH training samples by using bisulfite sequencing (BS-seq) to screen for species with abundant high-methylation CHH sites, and we generate datasets that cover diverse 9-mer motifs for training and testing DeepPlant. Evaluated across nine species, DeepPlant achieves high whole-genome methylation frequency correlations (0.705-0.838) with BS-seq data on CHH, an improvement of 23.4-117.6% over Dorado. DeepPlant also demonstrates superior single-molecule accuracy and F1 score, offering strong generalization for plant epigenetics research.
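
The evaluation described above compares per-site methylation frequencies from nanopore calls against BS-seq within each sequence context. The following is a minimal Python sketch of that kind of comparison, not DeepPlant's code. It assumes the reported correlation is a Pearson coefficient and that the percentage improvement is relative to the baseline; the abstract states neither. The function names `cytosine_context`, `pearson`, and `relative_improvement` are hypothetical.

```python
from math import sqrt


def cytosine_context(seq, i):
    """Classify the forward-strand cytosine at index i as CpG, CHG, or CHH.

    H means A, C, or T. Returns None if seq[i] is not C, if the context
    runs off the end of the sequence, or if it contains an ambiguous base.
    """
    seq = seq.upper()
    if seq[i] != "C" or i + 1 >= len(seq):
        return None
    if seq[i + 1] == "G":
        return "CpG"
    if seq[i + 1] not in "ACT" or i + 2 >= len(seq):
        return None
    if seq[i + 2] == "G":
        return "CHG"
    if seq[i + 2] in "ACT":
        return "CHH"
    return None


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of site frequencies.

    Assumption: the abstract does not name the correlation statistic.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 sites")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline` (assumed to be relative)."""
    return (new - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    nanopore_freq = [0.05, 0.40, 0.80, 0.10]
    bsseq_freq = [0.00, 0.35, 0.90, 0.20]
    print(cytosine_context("ACATG", 1), pearson(nanopore_freq, bsseq_freq))
```

In practice the per-site frequencies would first be grouped by context, so that CHH sites are correlated separately from CpG and CHG sites.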

MeSH terms

  • 5-Methylcytosine* / analysis
  • 5-Methylcytosine* / metabolism
  • CpG Islands / genetics
  • DNA Methylation
  • DNA, Plant / genetics
  • Deep Learning
  • Epigenesis, Genetic
  • Genome, Plant / genetics
  • Nanopore Sequencing* / methods
  • Plants* / genetics

Substances

  • 5-Methylcytosine
  • DNA, Plant