Background: The recent discoveries of microRNA (miRNA) genes and characterization of the first few target genes regulated by miRNAs in Caenorhabditis elegans and Drosophila melanogaster have set the stage for elucidation of a novel network of regulatory control. We present a computational method for whole-genome prediction of miRNA target genes. The method is validated using known examples. For each miRNA, target genes are selected on the basis of three properties: sequence complementarity using a position-weighted local alignment algorithm, free energies of RNA-RNA duplexes, and conservation of target sites in related genomes. Application to the D. melanogaster, Drosophila pseudoobscura and Anopheles gambiae genomes identifies several hundred target genes potentially regulated by one or more known miRNAs.
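To make the first selection criterion concrete, here is a minimal sketch of position-weighted complementarity scoring. It is a simplified, ungapped sliding-window scan, not the gapped local alignment described above, and it omits the free-energy and conservation filters. The pair scores, the mismatch penalty, and the `SEED_WEIGHT` and `SEED_LENGTH` parameters are illustrative assumptions, as are the function names.

```python
# Illustrative pair scores (assumptions, not values stated in the text above).
PAIR_SCORES = {
    ("G", "C"): 5, ("C", "G"): 5,   # Watson-Crick pairs
    ("A", "U"): 5, ("U", "A"): 5,
    ("G", "U"): 2, ("U", "G"): 2,   # wobble pairs
}
MISMATCH = -3       # hypothetical penalty for any other pairing
SEED_WEIGHT = 2.0   # hypothetical extra weight for the miRNA 5' end
SEED_LENGTH = 8     # hypothetical number of 5' positions that get the weight


def duplex_score(mirna, site):
    """Score an ungapped duplex; both sequences are written 5'->3' and have equal length."""
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    total = 0.0
    # The duplex is antiparallel: the miRNA 5' end pairs with the site's 3' end,
    # so the site is read backwards.
    for i, (m, t) in enumerate(zip(mirna, reversed(site))):
        score = PAIR_SCORES.get((m, t), MISMATCH)
        weight = SEED_WEIGHT if i < SEED_LENGTH else 1.0
        total += weight * score
    return total


def best_site(mirna, utr):
    """Slide the miRNA along the UTR and return (best_score, start_index)."""
    n = len(mirna)
    best = (float("-inf"), -1)
    for start in range(len(utr) - n + 1):
        s = duplex_score(mirna, utr[start:start + n])
        if s > best[0]:
            best = (s, start)
    return best
```

Under these assumed weights, a mismatch at the miRNA 5' end lowers the score more than the same mismatch at the 3' end, which is the intent of position weighting. A fuller pipeline would then filter candidate sites by duplex free energy and by conservation in related genomes.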
Results: These potential targets are rich in genes that are expressed at specific developmental stages and that are involved in cell fate specification, morphogenesis and the coordination of developmental processes, as well as genes that are active in the mature nervous system. High-ranking target genes show a two-fold enrichment for transcription factors and include genes already known to be under translational regulation. Our results reaffirm the thesis that miRNAs have an important role in establishing the complex spatial and temporal patterns of gene activity necessary for the orderly progression of development and suggest additional roles in the function of the mature organism. In addition, the results point the way to directed experiments to determine miRNA functions.
Conclusions: The emerging combinatorics of miRNA target sites in the 3' untranslated regions of messenger RNAs are reminiscent of transcriptional regulation in promoter regions of DNA, with both one-to-many and many-to-one relationships between regulator and target. Typically, more than one miRNA regulates one message, indicative of cooperative translational control. Conversely, one miRNA may have several target genes, reflecting target multiplicity. As a guide to focused experiments, we provide detailed online information about likely target genes and binding sites in their untranslated regions, organized by miRNA or by gene and ranked by likelihood of match. The target prediction algorithm is freely available and can be applied to whole genome sequences using identified miRNA sequences.
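The one-to-many and many-to-one relationships can be illustrated with a small sketch that inverts a miRNA-to-gene mapping into a gene-to-miRNA mapping. The function names and the placeholder identifiers in the usage example (`miR-A`, `gene1`, and so on) are hypothetical and do not correspond to actual predictions.

```python
from collections import defaultdict


def invert_targets(mirna_to_genes):
    """Turn {miRNA: set of target genes} into {gene: set of regulating miRNAs}."""
    gene_to_mirnas = defaultdict(set)
    for mirna, genes in mirna_to_genes.items():
        for gene in genes:
            gene_to_mirnas[gene].add(mirna)
    return dict(gene_to_mirnas)


def cooperative_genes(gene_to_mirnas, min_regulators=2):
    """List genes targeted by at least `min_regulators` distinct miRNAs."""
    return sorted(
        gene for gene, mirnas in gene_to_mirnas.items()
        if len(mirnas) >= min_regulators
    )


# Hypothetical placeholder data, for illustration only.
predictions = {
    "miR-A": {"gene1", "gene2"},   # one miRNA, several targets
    "miR-B": {"gene2", "gene3"},
}
by_gene = invert_targets(predictions)
shared = cooperative_genes(by_gene)  # genes with more than one predicted regulator
```

Organizing predictions both ways mirrors the two views mentioned above: results listed by miRNA or by gene.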