Identification of 6-methyladenosine sites using novel feature encoding methods and ensemble models

Sci Rep. 2024 Apr 8;14(1):8180. doi: 10.1038/s41598-024-58353-8.

Abstract

N6-methyladenosine (6mA) is the most common internal modification in eukaryotic mRNA. Conventional approaches for detecting it, such as mass spectrometry and site-directed mutagenesis, have been shown to be laborious and challenging. In recent years, there has been rising interest in analyzing RNA sequences to systematically investigate mutated locations. This work aimed to identify 6mA locations in RNA sequences using novel feature encoding methods. The resulting features were used to train ensemble models with stacking, boosting, and bagging. The trained ensemble models were assessed using k-fold cross-validation and an independent test set. The proposed model outperformed baseline predictors on the key accuracy measures.
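
The workflow described in the abstract (encode sequences as features, train an ensemble, score it with k-fold cross-validation) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the k-mer frequency encoding stands in for the authors' novel features, the nearest-centroid base learner and the bagging-only ensemble are arbitrary choices (stacking and boosting are not shown), and the sequences are synthetic toy data.

```python
import random
from collections import Counter
from itertools import product

ALPHABET = "ACGU"

def kmer_features(seq, k=2):
    """Normalized k-mer frequencies (a generic encoding, NOT the paper's)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return [counts[km] / total for km in kmers]

def fit_centroids(X, y):
    """Hypothetical base learner: one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def fit_bagging(X, y, n_estimators=11, rng=None):
    """Bagging: train each base learner on a bootstrap resample."""
    rng = rng or random.Random(0)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, x):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

def k_fold_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        models = fit_bagging([X[i] for i in train_idx],
                             [y[i] for i in train_idx],
                             rng=random.Random(seed))
        correct = sum(predict_bagging(models, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Synthetic toy data: "positive" sequences are A-rich, "negative" are U-rich.
rng = random.Random(42)
pos = ["".join(rng.choices(ALPHABET, weights=[7, 1, 1, 1], k=21)) for _ in range(20)]
neg = ["".join(rng.choices(ALPHABET, weights=[1, 1, 1, 7], k=21)) for _ in range(20)]
X = [kmer_features(s) for s in pos + neg]
y = [1] * 20 + [0] * 20

cv_accuracy = k_fold_accuracy(X, y, k=5)
print(f"mean 5-fold accuracy on toy data: {cv_accuracy:.2f}")
```

An independent test set, as mentioned in the abstract, would simply be a further batch of sequences that is never passed to `fit_bagging` or `k_fold_accuracy` and is scored once with `predict_bagging` after model selection.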

MeSH terms

  • Adenosine* / genetics
  • RNA* / genetics
  • RNA, Messenger
  • Research Design

Substances

  • RNA
  • RNA, Messenger
  • Adenosine