BMC Bioinformatics. 2007 Mar 1;8:69.
doi: 10.1186/1471-2105-8-69.

Inference of miRNA targets using evolutionary conservation and pathway analysis

Dimos Gaidatzis et al. BMC Bioinformatics. 2007.

Erratum in

  • BMC Bioinformatics. 2007 Jul 12;8(1):248

Abstract

Background: MicroRNAs have emerged as important regulatory genes in a variety of cellular processes and, in recent years, hundreds of such genes have been discovered in animals. In contrast, functional annotations are available only for a very small fraction of these miRNAs, and even in these cases only partially.

Results: We developed a general Bayesian method for the inference of miRNA target sites, in which, for each miRNA, we explicitly model the evolution of orthologous target sites in a set of related species. Using this method we predict target sites for all known miRNAs in flies, worms, fish, and mammals. By comparing our predictions in fly with a reference set of experimentally tested miRNA-mRNA interactions we show that our general method performs at least as well as the most accurate methods available to date, including ones specifically tailored for target prediction in fly. An important novel feature of our model is that it explicitly infers the phylogenetic distribution of functional target sites, independently for each miRNA. This allows us to infer species-specific and clade-specific miRNA targeting. We also show that, in long human 3' UTRs, miRNA target sites occur preferentially near the start and near the end of the 3' UTR. To characterize miRNA function beyond the predicted lists of targets we further present a method to infer significant associations between the sets of targets predicted for individual miRNAs and specific biochemical pathways, in particular those of the KEGG pathway database. We show that this approach retrieves several known functional miRNA-mRNA associations, and predicts novel functions for known miRNAs in cell growth and in development.

Conclusion: We have presented a Bayesian target prediction algorithm that has no tunable parameters and can be applied to sequences from any clade of species. The algorithm automatically infers the phylogenetic distribution of functional sites for each miRNA, and assigns a posterior probability to each putative target site. The results presented here indicate that our general method achieves very good performance in predicting miRNA target sites, while also providing insights into the evolution of target sites for individual miRNAs. Moreover, by combining our predictions with pathway analysis, we propose functions of specific miRNAs in nervous system development, inter-cellular communication and cell growth. The complete target site predictions as well as the miRNA/pathway associations are accessible on the ElMMo web server.


Figures

Figure 1
MiRNA seed types and conservation fold enrichment. Schematic representation of the different "seed types" of miRNA target sites that we consider and conservation fold enrichment for each of them. a. Seed type interactions of miRNA-mRNA hybrids (see text). b. Conservation fold enrichment for the 9 different seed types in the four clades.
Figure 2
Examples of inferred phylogenetic distributions of functional target sites. Comparison of the inferred phylogenetic distribution of functional target sites across vertebrate species (human – H. sapiens, chimp – P. troglodytes, rhesus macaque – M. mulatta, mouse – M. musculus, rat – R. norvegicus, cow – B. taurus, dog – C. familiaris, opossum – M. domestica, chicken – G. gallus) for 4 different miRNAs. Starting from human at the root, the thickness of each branch of the tree represents the fraction of putative target sites inferred to be selected along that branch. The bars at each internal node indicate what fraction of sites remains under selection in both descending branches (green), only the left descending branch (red), and only the right descending branch (blue). For each of the human miRNAs shown in this figure, there exists at least one miRNA with the same 1–8 "seed" sequence in all vertebrates in the tree.
Figure 3
miRNA seed matches under functional selection. The fraction ρ of seed matches inferred to be under selection (vertical axis) vs. the total number of sites inferred to be under selection in the entire set of mRNAs (horizontal axis) for individual miRNAs. Each star corresponds to one miRNA and each panel corresponds to one clade of species, with the reference species indicated at the top.
Figure 4
Performance comparison with other methods. Comparison of the performance of our method and other published methods on a set of 120 experimentally tested miRNA-mRNA interactions in fly. Specificity (fraction of negatives that are not predicted) is shown as a function of sensitivity (fraction of positives that are predicted) for our method at different cutoffs in posterior probability (black line) and for other methods (colored dots).
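The sensitivity and specificity definitions in this caption can be illustrated with a minimal sketch. The function name, the toy posteriors, and the labels below are hypothetical and are not taken from the paper's reference set of 120 interactions; the sketch only assumes that an interaction counts as predicted when its posterior probability is at or above the cutoff.

```python
def sensitivity_specificity(posteriors, labels, cutoff):
    """Return (sensitivity, specificity) when predicting at posterior >= cutoff.

    posteriors: posterior probability assigned to each tested interaction.
    labels: True for experimentally positive interactions, False for negatives.
    """
    pairs = list(zip(posteriors, labels))
    tp = sum(1 for p, y in pairs if y and p >= cutoff)      # positives predicted
    fn = sum(1 for p, y in pairs if y and p < cutoff)       # positives missed
    tn = sum(1 for p, y in pairs if not y and p < cutoff)   # negatives not predicted
    fp = sum(1 for p, y in pairs if not y and p >= cutoff)  # negatives predicted
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Made-up toy data (assumption), three positives followed by three negatives.
posteriors = [0.9, 0.7, 0.4, 0.2, 0.8, 0.1]
labels = [True, True, True, False, False, False]

for cutoff in (0.3, 0.5, 0.75):
    print(cutoff, sensitivity_specificity(posteriors, labels, cutoff))
```

Sweeping the cutoff in this way traces out a curve like the black line described in the caption, whereas a method with a fixed prediction list contributes a single point.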
Figure 5
Location bias of predicted miRNA target sites in UTRs. Distribution of predicted miRNA target sites in the 3' UTRs. Each predicted miRNA target site is represented by a dot with the x-coordinate corresponding to the length of the associated 3' UTR and the y-coordinate corresponding to the localization of the site within the 3' UTR, normalized from 0 (start) to 1 (end). Gaussian kernels around all the dots were used to create a smooth interpolating density surface. Since the general UTR length distribution is not uniform, we normalized the vertical slices through the 2-D density surface p(x, y) at each x-coordinate to obtain p(y|x).
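The normalization from p(x, y) to p(y|x) described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the product Gaussian kernel, the bandwidths `bw_x` and `bw_y`, the grids, and the toy sites are all hypothetical, and each slice is normalized with a simple Riemann sum over an evenly spaced y grid.

```python
import math


def conditional_density(sites, x_grid, y_grid, bw_x, bw_y):
    """Estimate p(y | x) on a grid from (utr_length, relative_position) points.

    A Gaussian kernel is centred on every site to build an unnormalized
    surface p(x, y); each vertical slice (fixed x) is then rescaled so that
    it sums to 1 over y (times the grid spacing).
    """
    dy = y_grid[1] - y_grid[0]  # assumes an evenly spaced y grid
    slices = []
    for x in x_grid:
        column = [
            sum(
                math.exp(-0.5 * ((x - sx) / bw_x) ** 2
                         - 0.5 * ((y - sy) / bw_y) ** 2)
                for sx, sy in sites
            )
            for y in y_grid
        ]
        total = sum(column) * dy
        # Leave the slice untouched if no site contributes any density.
        slices.append([v / total for v in column] if total > 0 else column)
    return slices


# Made-up sites (assumption): (3' UTR length in nt, relative position in [0, 1]).
sites = [(1000, 0.1), (1200, 0.9), (5000, 0.05), (5200, 0.95), (4800, 0.5)]
x_grid = [1000, 3000, 5000]
y_grid = [i / 20 for i in range(21)]
density = conditional_density(sites, x_grid, y_grid, bw_x=1000.0, bw_y=0.1)
```

Normalizing each slice separately removes the effect of the uneven UTR length distribution, so slices for rare and common UTR lengths can be compared on the same scale.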
Figure 6
Location of predicted target sites of individual miRNAs in the 3' UTRs. Histogram of the relative position, from 0 (start) to 1 (end), of high-probability predicted target sites (posterior probability ≥ 0.5) for 6 individual miRNAs in human 3' UTRs longer than 4 kb. The identity of the miRNAs and their corresponding seed sequences (positions 1–8 from the 5' end of the mature miRNA) are indicated on each panel.
Figure 7
Pathway analysis. Representation of individual pathways among the predicted targets of a given miRNA. Each column corresponds to a KEGG pathway and each row to a group of miRNAs with the same seed sequence. Red indicates overrepresentation of the targets of a specific miRNA among the genes in the corresponding pathway, whereas blue indicates depletion. The intensity of the color indicates the posterior probability of the dependent model (see Methods). Pathways have been grouped in larger functional categories according to the KEGG annotation. Only miRNAs with at least one significant association are shown.
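The idea of overrepresentation of a miRNA's targets in a pathway can be illustrated with a simple stand-in. The paper scores associations by the posterior probability of a dependent model, which this excerpt does not specify; the sketch below instead uses a classical hypergeometric upper-tail probability, a different and simpler technique swapped in for illustration. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(n_genes, n_pathway, n_targets, n_overlap):
    """P(overlap >= n_overlap) under random sampling without replacement.

    n_genes:   total number of genes considered
    n_pathway: genes annotated to the pathway
    n_targets: predicted targets of the miRNA
    n_overlap: predicted targets that fall in the pathway
    """
    total = comb(n_genes, n_targets)
    upper = min(n_pathway, n_targets)
    favourable = sum(
        comb(n_pathway, k) * comb(n_genes - n_pathway, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / total


# Made-up counts (assumption): 20 genes, 5 in the pathway, 4 targets, 3 shared.
p_value = hypergeom_upper_tail(20, 5, 4, 3)
print(round(p_value, 4))
```

A small tail probability corresponds to the red (overrepresented) cells in the figure only loosely, since the figure reports posterior probabilities rather than p-values; depletion would be assessed with the lower tail.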
Figure 8
Modeling the selection pressure on miRNA target sites. a. The phylogenetic tree of the species in the clade (here flies) is rooted at the reference species (here D. melanogaster) and selection is modeled starting from the root and moving down the tree. At each internal node k there are probabilities for selection to be maintained in one or both children of the node (see Methods for details). b. Relationship between selection and conservation patterns: example of a selection pattern on a particular set of orthologous target sites in flies. Open circles indicate absence of selection pressure, closed circles indicate presence of selection pressure. Selection pressure is absent in D. ananassae, D. mojavensis, and D. virilis. The possible conservation patterns consistent with the selection pattern for this target site are listed in the table. The site needs to be conserved in all species in which selection pressure operates, namely D. simulans, D. yakuba, and D. pseudoobscura. In the species in which selection pressure does not operate, the site may or may not be conserved.
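The relationship between a selection pattern and its consistent conservation patterns in panel b can be enumerated directly. The sketch below is illustrative only: the function name and the dictionary encoding are hypothetical, and the reference species is left out because the site is present there by construction. It encodes just the rule stated in the caption, namely that the site must be conserved wherever selection operates and is unconstrained elsewhere.

```python
from itertools import product


def consistent_conservation_patterns(selection):
    """List every conservation pattern compatible with a selection pattern.

    selection maps species name -> True if the site is under selection there.
    Species under selection must be conserved; the others may or may not be.
    """
    free = [s for s, selected in selection.items() if not selected]
    patterns = []
    for choice in product([True, False], repeat=len(free)):
        # Conservation is forced wherever selection operates.
        pattern = {s: True for s, selected in selection.items() if selected}
        # Every combination is allowed for the unconstrained species.
        pattern.update(zip(free, choice))
        patterns.append(pattern)
    return patterns


# Selection pattern from the caption: closed circles True, open circles False.
selection = {
    "D. simulans": True,
    "D. yakuba": True,
    "D. pseudoobscura": True,
    "D. ananassae": False,
    "D. mojavensis": False,
    "D. virilis": False,
}
patterns = consistent_conservation_patterns(selection)
print(len(patterns))  # 2**3 = 8 patterns, one per choice for the 3 free species
```

With three species free of selection, the observed conservation pattern can be any of eight, which is why a single observed pattern is compatible with several selection histories.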
