AGTAR: A novel approach for transcriptome assembly and abundance estimation using an adapted genetic algorithm from RNA-seq data

Comput Biol Med. 2021 Aug:135:104646. doi: 10.1016/j.compbiomed.2021.104646. Epub 2021 Jul 10.

Abstract

Background: The rapid development of RNA-seq technologies has accelerated transcriptomics research. Accurate identification and quantification of transcripts from RNA-seq data facilitates the exploration of potential biological mechanisms. However, because of limitations in current data analysis tools and RNA-seq technologies, complete and accurate reconstruction of the transcriptome remains challenging.

Results: We developed AGTAR, a program based on an adapted genetic algorithm, which can reliably assemble transcriptomes and estimate abundance from RNA-seq data with or without genome annotation files. We defined a new concept, isoform junction abundance, to improve the accuracy of isoform identification and quantification. Both isoform abundance and isoform junction abundance are estimated by the adapted genetic algorithm, whose crossover and mutation probabilities are adjusted adaptively to prevent premature convergence. On both simulated and real data, AGTAR's overall transcript assembly performance was significantly superior to that of widely used tools with similar functions.
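The abstract does not state how the crossover and mutation probabilities are adapted. As a minimal sketch of the general idea only, the following uses a well-known fitness-based scheme (in the style of Srinivas and Patnaik), which is an assumption here and not necessarily AGTAR's rule: for a maximization problem, individuals at or above the population's average fitness receive a probability that shrinks toward zero as they approach the best fitness, while below-average individuals keep a fixed, higher probability. The function name `adaptive_rate` and the constants `k_above` and `k_below` are hypothetical.

```python
def adaptive_rate(fitness, f_max, f_avg, k_above, k_below):
    """Fitness-dependent operator probability (illustrative, not AGTAR's code).

    Assumes fitness is being maximized. `k_above` scales the probability for
    individuals at or above average fitness; `k_below` is the fixed probability
    for below-average individuals.
    """
    # If the whole population shares one fitness value, or this individual is
    # below average, apply the fixed (disruptive) rate to keep diversity.
    if f_max == f_avg or fitness < f_avg:
        return k_below
    # Otherwise scale linearly: k_above at average fitness, 0 at the best.
    return k_above * (f_max - fitness) / (f_max - f_avg)


# Toy population of fitness values (made up for illustration).
population_fitness = [2.0, 4.0, 6.0, 8.0]
f_max = max(population_fitness)
f_avg = sum(population_fitness) / len(population_fitness)

# Per-individual mutation probabilities under the assumed constants.
mutation_rates = [
    adaptive_rate(f, f_max, f_avg, k_above=0.5, k_below=0.5)
    for f in population_fitness
]
```

Under this scheme the best individual is protected from disruption while weaker individuals are varied more aggressively, which is one common way to slow premature convergence.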

Conclusions: AGTAR is a tool for identifying and quantifying transcripts from RNA-seq data. It offers higher accuracy and is easy to use. The AGTAR package is freely available at https://github.com/v4yuezi/AGTAR.git.

Keywords: Abundance estimation; Adapted genetic algorithm; RNA-seq; Transcript assembly.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Gene Expression Profiling*
  • RNA-Seq
  • Sequence Analysis, RNA
  • Software
  • Transcriptome* / genetics