Computational methods for annotation of plant regulatory non-coding RNAs using RNA-seq

Brief Bioinform. 2021 Jul 20;22(4):bbaa322. doi: 10.1093/bib/bbaa322.

Abstract

The plant transcriptome encompasses numerous endogenous, regulatory non-coding RNAs (ncRNAs) that play a major biological role in regulating key physiological mechanisms. While studies have shown that ncRNAs are extremely diverse and ubiquitous, the functions of the vast majority of ncRNAs are still unknown. With an ever-increasing number of ncRNAs under study, it is essential to identify, categorize and annotate these ncRNAs on a genome-wide scale. The use of high-throughput RNA sequencing (RNA-seq) technologies provides a broader picture of the non-coding component of the transcriptome, enabling the comprehensive identification and annotation of all major ncRNAs across samples. However, the detection of known and emerging classes of ncRNAs from RNA-seq data demands complex computational methods, because these classes have both unique and shared characteristics. Here, we discuss the major endogenous, regulatory ncRNAs of plants found in an RNA sample, followed by the computational strategies applied to discover each class of ncRNAs using RNA-seq. We also provide a collection of relevant software packages and databases to present a comprehensive bioinformatics toolbox for plant ncRNA researchers. We expect that the discussions in this review will provide a rationale for the discovery of all major categories of plant ncRNAs.

Keywords: RNA-seq; circRNA; lncRNA; miRNA; ncRNAs; sRNA-seq; siRNA; tsRNA.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Databases, Nucleic Acid*
  • Gene Expression Regulation, Plant*
  • Plants* / genetics
  • Plants* / metabolism
  • RNA, Plant* / biosynthesis
  • RNA, Plant* / genetics
  • RNA, Untranslated* / biosynthesis
  • RNA, Untranslated* / genetics
  • RNA-Seq*
  • Software*

Substances

  • RNA, Plant
  • RNA, Untranslated