Computational identification and functional predictions of long noncoding RNA in Zea mays

PLoS One. 2012;7(8):e43047. doi: 10.1371/journal.pone.0043047. Epub 2012 Aug 16.

Abstract

Background: Computational analysis of cDNA sequences from multiple organisms suggests that a large portion of transcribed DNA does not code for a functional protein. In mammals, noncoding transcription is abundant and often yields functional RNA molecules that do not appear to encode proteins. Many long noncoding RNAs (lncRNAs), including HOTAIR and XIST, appear to have epigenetic regulatory functions in humans. While epigenetic gene regulation is clearly an essential mechanism in plants, relatively little is known about the presence or function of lncRNAs in plants.

Methodology/principal findings: To explore the connection between lncRNA and epigenetic regulation of gene expression in plants, a computational pipeline written in Python was developed and applied to maize full-length cDNA sequences to identify, classify, and localize potential lncRNAs. The pipeline was run in parallel with an SVM-based tool for identifying ncRNAs in order to recover the maximal number of ncRNAs from the dataset. Although the available library of sequences was small and potentially biased toward protein-coding transcripts, 15% of the sequences were predicted to be noncoding. Approximately 60% of these sequences appear to act as precursors for small RNA molecules and may regulate gene expression via a small RNA-dependent mechanism. ncRNAs were predicted to originate from both genic and intergenic loci. Of the lncRNAs that originated from genic loci, ∼20% were antisense to the host gene loci.
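
The abstract does not give the pipeline's filtering criteria, so the following Python sketch is only an illustration of how an identify-and-classify step of this general kind might look. It assumes a common heuristic (minimum transcript length plus a cap on the longest complete open reading frame) and a simple coordinate-overlap rule for genic, antisense, and intergenic calls. The thresholds `min_length=200` and `max_orf_codons=100` and the function names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: thresholds and function names are assumptions,
# not the criteria used by the published pipeline.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_codons(seq):
    """Longest complete ORF (ATG through the codon before a stop), in codons.

    All six reading frames are scanned. ORFs that run off the end of the
    sequence without a stop codon are not counted.
    """
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            run = None  # codons since the current ATG, or None if no ORF is open
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if run is None:
                    if codon == "ATG":
                        run = 1
                elif codon in STOP_CODONS:
                    best = max(best, run)
                    run = None
                else:
                    run += 1
    return best


def is_candidate_lncrna(seq, min_length=200, max_orf_codons=100):
    """Flag a transcript as a candidate lncRNA (hypothetical thresholds)."""
    return len(seq) >= min_length and longest_orf_codons(seq) <= max_orf_codons


def classify_locus(transcript, genes):
    """Classify a transcript relative to annotated genes on one chromosome.

    `transcript` and each entry of `genes` are (start, end, strand) tuples
    with inclusive coordinates. The first overlapping gene decides the call.
    """
    t_start, t_end, t_strand = transcript
    for g_start, g_end, g_strand in genes:
        if t_start <= g_end and g_start <= t_end:  # intervals overlap
            return "genic_sense" if t_strand == g_strand else "genic_antisense"
    return "intergenic"
```

In a fuller workflow, candidates passing a filter like this would be compared with the output of a separate classifier (the abstract mentions an SVM tool) and the union of predictions retained.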

Conclusions/significance: Consistent with similar studies in other organisms, noncoding transcription appears to be widespread in the maize genome. Computational predictions indicate that maize lncRNAs may regulate the expression of other genes through multiple RNA-mediated mechanisms.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / methods*
  • RNA, Long Noncoding / genetics*
  • RNA, Plant / genetics*
  • Zea mays / genetics*

Substances

  • RNA, Long Noncoding
  • RNA, Plant

Grants and funding

This work was funded by the National Science Foundation and Genome Systems grant through award number MCB-027129 to KMM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.