A systemic identification approach for primary transcription start site of Arabidopsis miRNAs from multidimensional omics data

Funct Integr Genomics. 2017 May;17(2-3):353-363. doi: 10.1007/s10142-016-0541-9. Epub 2016 Dec 28.

Abstract

MicroRNAs (miRNAs), 22-nucleotide non-coding RNAs, are mostly transcribed by RNA polymerase II in a manner similar to protein-coding genes. Although the processing of stem-loop precursors into mature miRNAs is well understood, the primary transcriptional regulation of miRNAs, especially in plants, still needs to be clarified, including the original transcription start site, functional cis-elements, and primary transcript structures. Because the promoter region carries several well-characterized transcription signals, we proposed a systemic approach that integrates multidimensional "omics" data (genomics, transcriptomics, and epigenomics) to improve the genome-wide identification of primary miRNA transcripts. Using the model plant Arabidopsis thaliana, we improved the identification of candidate promoter locations for intergenic miRNAs and determined rules for identifying their primary transcription start sites by integrating high-throughput omics data: DNase I hypersensitive sites, chromatin immunoprecipitation-sequencing of RNA polymerase II and H3K4me3, and high-throughput transcriptomic data. As a result, 93% of the refined primary transcripts could be confirmed by primer pairs from a previous study. Cis-element and secondary structure analyses also supported these results. This work will contribute to the analysis of primary transcriptional regulation of miRNAs, and the conserved regulatory pattern may be a useful miRNA characteristic in other plant species.
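
The idea of combining several evidence layers to choose a primary transcription start site can be illustrated with a small sketch. The abstract does not state the actual rules, so everything below is an assumption made for illustration: the `Candidate` record, the `evidence_count` and `pick_primary_tss` helpers, the `min_evidence` threshold, and the tie-breaking by proximity to the precursor are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A hypothetical candidate TSS with one flag per evidence layer."""
    position: int        # genomic coordinate of the candidate site
    in_dhs: bool         # overlaps a DNase I hypersensitive site
    pol2_peak: bool      # overlaps an RNA polymerase II ChIP-seq peak
    h3k4me3_peak: bool   # overlaps an H3K4me3 ChIP-seq peak
    rna_support: bool    # supported by transcriptomic read coverage


def evidence_count(c: Candidate) -> int:
    """Number of independent evidence layers supporting the candidate."""
    return sum([c.in_dhs, c.pol2_peak, c.h3k4me3_peak, c.rna_support])


def pick_primary_tss(
    candidates: List[Candidate],
    precursor_start: int,
    strand: str = "+",
    min_evidence: int = 2,   # assumed threshold, not taken from the paper
) -> Optional[Candidate]:
    """Pick the best-supported candidate upstream of the miRNA precursor."""
    # Keep only candidates upstream of the precursor on the given strand.
    if strand == "+":
        upstream = [c for c in candidates if c.position <= precursor_start]
    else:
        upstream = [c for c in candidates if c.position >= precursor_start]

    # Require a minimum number of supporting evidence layers.
    supported = [c for c in upstream if evidence_count(c) >= min_evidence]
    if not supported:
        return None

    # Prefer more evidence; break ties by proximity to the precursor.
    return max(
        supported,
        key=lambda c: (evidence_count(c), -abs(c.position - precursor_start)),
    )
```

For example, given candidates at positions 1000 (all four layers), 1500 (two layers), and 1800 (one layer) upstream of a plus-strand precursor starting at 2000, `pick_primary_tss` returns the candidate at 1000. A real pipeline would work on peak intervals and read coverage rather than precomputed flags.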

Keywords: Arabidopsis; Cis-element; Epigenomics; Intergenic miRNA; Primary transcription start site.

MeSH terms

  • Arabidopsis / genetics*
  • MicroRNAs / genetics*
  • Transcription Initiation Site*

Substances

  • MicroRNAs