DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning

BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):524. doi: 10.1186/s12859-018-2516-4.

Abstract

Background: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, interplay with microRNAs, and translation efficiency. Although existing tools can predict m6A at single-base resolution, it is still challenging to extract the biological information surrounding m6A sites.

Results: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq performed better than other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, revealed that our model is competitive in predicting m6A-containing sequences. The motifs learned by DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader: FMR1. In addition, we found that a saliency map in the deep learning model could be used to visualize the locations of m6A sites.
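
The saliency-map idea mentioned in the Results can be illustrated with a toy sketch. The abstract does not describe the network architecture or the exact saliency method, so everything below is an assumption made for illustration: the "model" is a single hand-written convolution filter followed by a max over positions, the filter weights (TOY_FILTER) are hypothetical rather than learned by DeepM6ASeq, and saliency is computed as gradient times input, which is only one common variant.

```python
# Toy illustration only: the filter weights are hypothetical, not DeepM6ASeq's.
BASES = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as rows of 0/1 values in the order A, C, G, U."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv_scores(x, weights):
    """Slide one filter over the one-hot matrix; return a score per window start."""
    width = len(weights)
    scores = []
    for start in range(len(x) - width + 1):
        score = 0.0
        for k in range(width):
            for c in range(4):
                score += weights[k][c] * x[start + k][c]
        scores.append(score)
    return scores


def saliency(x, weights):
    """Gradient-times-input saliency for the score max(conv_scores).

    The gradient of a max is non-zero only inside the best-scoring window,
    where it equals the filter weights; all other positions get zero.
    """
    scores = conv_scores(x, weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    sal = [0.0] * len(x)
    for k in range(len(weights)):
        sal[best + k] = sum(weights[k][c] * x[best + k][c] for c in range(4))
    return sal


# Hypothetical filter rewarding the 5-mer GGACU, with the central A weighted most.
TOY_FILTER = [
    [0.0, 0.0, 1.0, 0.0],  # G
    [0.0, 0.0, 1.0, 0.0],  # G
    [2.0, 0.0, 0.0, 0.0],  # A (candidate methylated base)
    [0.0, 1.0, 0.0, 0.0],  # C
    [0.0, 0.0, 0.0, 1.0],  # U
]

seq = "CUUGGACUAAC"
sal = saliency(one_hot(seq), TOY_FILTER)
peak = max(range(len(sal)), key=sal.__getitem__)
print(seq)
print(" " * peak + "^")  # marks the position with the highest saliency
```

In this toy case the highest saliency falls on the central adenosine of the matched window, which is the sense in which a saliency map can point to a candidate site within a longer sequence. A trained network would obtain the gradient by backpropagation rather than from a closed form.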

Conclusion: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences, which we hope will help investigators gain more insight into m6A biology. The source code is available at https://github.com/rreybeyb/DeepM6ASeq .

Keywords: Deep learning; N6-methyladenosine; RNA modification.

MeSH terms

  • Adenosine / analogs & derivatives*
  • Adenosine / chemistry
  • Adenosine / genetics
  • Alternative Splicing*
  • Animals
  • Brain / metabolism
  • Carcinoma, Hepatocellular / genetics
  • Carcinoma, Hepatocellular / pathology
  • Computational Biology / methods*
  • Deep Learning*
  • Embryonic Stem Cells / cytology
  • Embryonic Stem Cells / metabolism
  • Humans
  • Liver Neoplasms / genetics
  • Liver Neoplasms / pathology
  • Methylation
  • Mice
  • RNA / analysis*
  • RNA / genetics
  • Sequence Analysis, RNA / methods*
  • Zebrafish

Substances

  • RNA
  • N-methyladenosine
  • Adenosine