Background: Long non-coding RNAs (lncRNAs) are transcripts that are 200 bp or longer, do not encode proteins, and potentially play important roles in eukaryotic gene regulation. However, the number, characteristics and expression inheritance pattern of lncRNAs in maize are still largely unknown.
Results: By exploiting available public EST databases, maize whole genome sequence annotation and RNA-seq datasets from 30 different experiments, we identified 20,163 putative lncRNAs. Of these lncRNAs, more than 90% are predicted to be the precursors of small RNAs, while 1,704 are considered to be high-confidence lncRNAs. High confidence lncRNAs have an average transcript length of 463 bp and genes encoding them contain fewer exons than annotated genes. By analyzing the expression pattern of these lncRNAs in 13 distinct tissues and 105 maize recombinant inbred lines, we show that more than 50% of the high confidence lncRNAs are expressed in a tissue-specific manner, a result that is supported by epigenetic marks. Intriguingly, the inheritance of lncRNA expression patterns in 105 recombinant inbred lines reveals apparent transgressive segregation, and maize lncRNAs are less affected by cis- than by trans-genetic factors.
Conclusions: We integrate all available transcriptomic datasets to identify a comprehensive set of maize lncRNAs, provide a unique annotation resource of the maize genome and a genome-wide characterization of maize lncRNAs, and explore the genetic control of their expression using expression quantitative trait locus mapping.