Background: Small non-coding RNAs (21 to 24 nucleotides) regulate a number of developmental processes in plants and animals by silencing genes through multiple mechanisms. Among these, the most conserved classes are microRNAs (miRNAs) and small interfering RNAs (siRNAs), both of which are produced by RNase III-like enzymes called Dicers. Many plant miRNAs play critical roles in nutrient homeostasis, development, and responses to abiotic stress and pathogens. Currently, only 70 miRNAs have been identified in soybean.
Methods: To expand the collection of known soybean miRNAs, we used Illumina's sequencing-by-synthesis (SBS) technology to generate high-quality small RNA (sRNA) data from four soybean (Glycine max) tissues: root, seed, flower, and nodule. We developed a bioinformatics pipeline using in-house scripts and publicly available structure prediction tools to differentiate the authentic mature miRNA sequences from other sRNAs and short RNA fragments represented in the public sequencing data.
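As a minimal sketch of the kind of read pre-processing such a pipeline typically starts with, the Python below keeps reads in the 21 to 24 nucleotide range mentioned in the Background, collapses identical sequences into counts, and normalizes them to reads per million. The function names, the length window as a filter, and the normalization are illustrative assumptions; the in-house scripts themselves are not described here.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def collapse_reads(reads, min_len=21, max_len=24):
    """Count distinct read sequences within the given length window.

    Hypothetical helper: the 21-24 nt window follows the size range of
    small non-coding RNAs; reads with ambiguous bases are discarded.
    """
    counts = Counter()
    for seq in reads:
        seq = seq.strip().upper()
        if min_len <= len(seq) <= max_len and set(seq) <= VALID_BASES:
            counts[seq] += 1
    return counts


def reads_per_million(counts):
    """Scale raw counts so libraries of different depth are comparable."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}
```

Candidate sequences that pass this step would then be mapped to the genome and their flanking regions folded with a structure prediction tool to check for a hairpin precursor; that step is not shown.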
Results: The combined sequencing and bioinformatics analyses identified 129 miRNAs based on hairpin secondary structure features in the predicted precursors. Of these, 42 matched known miRNAs in soybean or other species, and the remaining 87 were novel. We also predicted the putative target genes of all identified miRNAs with computational methods and verified the predicted cleavage sites in vivo for a subset of these targets using the 5' RACE method. Finally, we studied the relationship between the abundance of each miRNA and that of its target genes by comparison to Solexa cDNA sequencing data.
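One simple way to examine the relationship between miRNA abundance and target abundance is a correlation across tissues. The sketch below computes a Pearson coefficient in plain Python; the choice of Pearson correlation and the function name are assumptions for illustration, not necessarily the comparison performed in this study. A negative coefficient would be consistent with a miRNA repressing its target.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles.

    Hypothetical helper: x and y would hold normalized abundances of a
    miRNA and one of its predicted targets across the same tissues.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)
```

With only four tissues, any such coefficient is a rough indicator rather than a statistical test.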
Conclusion: Our study significantly increased the number of miRNAs known to be expressed in soybean. The bioinformatics analysis provided insight into regulatory patterns between the miRNAs and the expression of their predicted target genes. We also deposited the data in a soybean genome browser based on the UCSC Genome Browser architecture. Using the browser, we annotated the soybean data with miRNA sequences from four tissues and cDNA sequencing data. Overlaying these two datasets in the browser allows researchers to analyze miRNA expression levels relative to those of the associated target genes. The browser can be accessed at http://digbio.missouri.edu/soybean_mirna/.