DeepMP: a deep learning tool to detect DNA base modifications on Nanopore sequencing data

Bioinformatics. 2021 Oct 28;btab745. doi: 10.1093/bioinformatics/btab745. Online ahead of print.

Abstract

Motivation: DNA methylation plays a key role in a variety of biological processes. Recently, Nanopore long-read sequencing has enabled direct detection of these modifications. As a consequence, a range of computational methods have been developed to exploit Nanopore data for methylation detection. However, current approaches rely on a human-defined threshold to determine the methylation status of a genomic position and are not optimized to detect sites methylated at low frequency. Furthermore, most methods use either the Nanopore signals or the basecalling errors as the model input and do not take advantage of their combination.
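To illustrate the limitation described above, the following is a minimal sketch of threshold-based position calling as it is commonly set up: per-read methylation probabilities are binarized, and a site is called methylated when the fraction of methylated reads reaches a cutoff. The function name `call_position`, both cutoff values and the example probabilities are assumptions made for illustration; they are not taken from DeepMP or from any specific tool.

```python
from typing import Sequence, Tuple


def call_position(read_probs: Sequence[float],
                  read_cutoff: float = 0.5,
                  site_cutoff: float = 0.5) -> Tuple[float, bool]:
    """Return (methylated fraction, site call) for one genomic position.

    Both cutoffs are human-defined values chosen for illustration.
    """
    if not read_probs:
        raise ValueError("need at least one read covering the position")
    # Binarize each read-level probability with the read cutoff.
    methylated_reads = sum(p >= read_cutoff for p in read_probs)
    fraction = methylated_reads / len(read_probs)
    # Call the site methylated only if enough reads are methylated.
    return fraction, fraction >= site_cutoff


# Hypothetical site where 2 of 10 reads are confidently methylated.
low_freq_site = [0.95, 0.90] + [0.05] * 8
fraction, called = call_position(low_freq_site)
print(fraction, called)
```

With these illustrative values the site is not called, even though two reads carry strong evidence of methylation. Lowering `site_cutoff` would recover it, but at the cost of choosing yet another arbitrary value.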

Results: Here we present DeepMP, a convolutional neural network (CNN)-based model that combines information from Nanopore signals and basecalling errors to detect whether a given motif in a read is methylated. In addition, DeepMP introduces a threshold-free position modification calling model that is sensitive to sites methylated at low frequency across cells. We comprehensively benchmarked DeepMP against state-of-the-art methods on E. coli, human and pUC19 datasets. DeepMP outperforms current approaches at read-based and position-based methylation detection across sites methylated at different frequencies in the three datasets.

Availability: DeepMP is implemented and freely available under the MIT license at https://github.com/pepebonet/DeepMP.

Supplementary information: Supplementary data are available at Bioinformatics online.