Genome Res., 19 (1), 92-105

Most Mammalian mRNAs Are Conserved Targets of microRNAs

Robin C Friedman et al. Genome Res.

Abstract

MicroRNAs (miRNAs) are small endogenous RNAs that pair to sites in mRNAs to direct post-transcriptional repression. Many sites that match the miRNA seed (nucleotides 2-7), particularly those in 3' untranslated regions (3'UTRs), are preferentially conserved. Here, we overhauled our tool for finding preferential conservation of sequence motifs and applied it to the analysis of human 3'UTRs, increasing by nearly threefold the detected number of preferentially conserved miRNA target sites. The new tool more efficiently incorporates new genomes and more completely controls for background conservation by accounting for mutational biases, dinucleotide conservation rates, and the conservation rates of individual UTRs. The improved background model enabled preferential conservation of a new site type, the "offset 6mer," to be detected. In total, >45,000 miRNA target sites within human 3'UTRs are conserved above background levels, and >60% of human protein-coding genes have been under selective pressure to maintain pairing to miRNAs. Mammalian-specific miRNAs have far fewer conserved targets than do the more broadly conserved miRNAs, even when considering only more recently emerged targets. Although pairing to the 3' end of miRNAs can compensate for seed mismatches, this class of sites constitutes less than 2% of all preferentially conserved sites detected. The new tool enables statistically powerful analysis of individual miRNA target sites, with the probability of preferentially conserved targeting (P(CT)) correlating with experimental measurements of repression. Our expanded set of target predictions (including conserved 3'-compensatory sites) is available at the TargetScan website, which displays the P(CT) for each site and each predicted target.
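The seed-matched site types named in the abstract can be illustrated with a minimal Python sketch. It assumes the usual TargetScan conventions, which this abstract does not spell out: a 7mer-m8 site adds a match to miRNA nucleotide 8, a 7mer-A1 site adds an A opposite miRNA nucleotide 1, and an 8mer has both. The function names and the miR-1 sequence used here are illustrative assumptions and should be checked against miRBase before any real use.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_sites(mirna: str) -> dict:
    """Return the 3'UTR motif (5'->3') for each seed-matched site type.

    miRNA positions are 1-based from the 5' end, so nucleotides 2-7
    correspond to the Python slice [1:7].
    """
    match_2_7 = reverse_complement(mirna[1:7])  # pairs to the seed, nt 2-7
    match_2_8 = reverse_complement(mirna[1:8])  # seed plus nt 8
    match_3_8 = reverse_complement(mirna[2:8])  # offset by one, nt 3-8
    return {
        "6mer": match_2_7,
        "7mer-A1": match_2_7 + "A",   # assumed convention: A opposite nt 1
        "7mer-m8": match_2_8,
        "8mer": match_2_8 + "A",      # assumed convention: m8 match plus A1
        "offset 6mer": match_3_8,
    }


# Assumed mature miR-1 sequence, for illustration only.
MIR1 = "UGGAAUGUAAAGAAGUAUGUAU"
sites = seed_match_sites(MIR1)
for name, motif in sites.items():
    print(f"{name:12s} {motif}")
```

Scanning a 3'UTR for these motifs, longest first, would then give mutually exclusive site calls; that scanning step is left out of the sketch.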

Figures

Figure 1.
Method for detecting preferential conservation of miRNA sites. (A) Sites matching the miRNA seed region. All four canonical sites (colored) share six contiguous Watson–Crick matches to the miRNA seed (nucleotides 2–7); the offset 6mer contains six contiguous matches to nucleotides 3–8. (B) Phylogenetic conservation of individual sites. Each panel represents a miR-1 8mer site conserved to a branch length of 1.0. The 3′UTR of SLC35B4 falls into the fourth-most poorly conserved bin, which has a phylogeny with relatively long branch lengths (left), whereas the 3′UTR of SPRED1 falls into the most well-conserved bin, which has a phylogeny with shorter branch lengths (right). Branch segments connecting the species with sites are colored purple, with numbers indicating the lengths of the longer segments. The lengths of the segments connecting the species are summed to yield the branch-length score (equation). (C) Performance of controls matched for number of occurrences in human 3′UTRs, GC content, or expected conservation as calculated by a first-order Markov model. For each possible RNA 7mer, conservation rates (signal) and average conservation rates for 50 control 7mers (background) were calculated for all 3′UTRs at a branch-length cutoff of 1.0. Error bars indicate 25th and 75th percentiles. Because most 7mer motifs are not selectively maintained, well-performing controls should yield median signal-to-background ratios near 1.0, with low variability for every bin. (D) Nested site conservation. Because the seed-matched sites are interrelated, we subtract conserved instances of extended sites from those of shorter sites. This hypothetical site is an 8mer match to miR-1 in a human 3′UTR that is more broadly conserved as a 7mer-m8 than as an 8mer site, and as a 6mer than as a 7mer-m8 site. As a result of nested subtraction, our method considers this site an 8mer at low branch-length cutoffs but not a 6mer or 7mer. 
At moderate branch-length cutoffs, it switches to a conserved 7mer-m8 site, and at high branch-length cutoffs it switches to a conserved 6mer site but not a 7mer or 8mer. (E) Signal and background for the miR-1 8mer site. For each UTR bin 1 through 10, with bin 1 having the least conserved UTRs and bin 10 the most conserved, the number of miR-1 sites conserved at the indicated branch-length cutoff is plotted with estimated background (small plots). Results for all 10 bins were then combined to represent the aggregate signal and background for this site (large plot).
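Panels B and D of Figure 1 can be sketched in Python under stated assumptions. The tree below is a toy phylogeny with made-up branch lengths, not the one used in the paper; `branch_length_score` and `nested_site_type` are hypothetical names; and treating a site as conserved when its score is greater than or equal to the cutoff is an assumption, since the caption does not state the comparison. The nesting order covers only the 8mer, 7mer-m8, 6mer chain described in panel D.

```python
def branch_length_score(parent, species_with_site):
    """Sum the branch lengths of the subtree connecting the given species.

    parent maps each non-root node to (parent_node, branch_length).
    A branch belongs to the connecting subtree when selected species lie
    on both sides of it, i.e. when some but not all of them sit below it.
    """
    selected = set(species_with_site)
    below = {}  # node -> number of selected species at or below that node
    for leaf in selected:
        node = leaf
        while node in parent:  # the root has no entry in parent
            below[node] = below.get(node, 0) + 1
            node = parent[node][0]
    total = len(selected)
    return sum(parent[node][1] for node, count in below.items() if count < total)


def nested_site_type(scores, cutoff):
    """Return the most extended site type whose score reaches the cutoff.

    Checking extended sites first mimics nested subtraction: a site
    counted as an 8mer is not counted again as a 7mer-m8 or a 6mer.
    """
    for site_type in ("8mer", "7mer-m8", "6mer"):
        if scores.get(site_type, 0.0) >= cutoff:
            return site_type
    return None


# Toy phylogeny with hypothetical branch lengths.
TOY_TREE = {
    "human": ("anc1", 0.1),
    "rodent": ("anc1", 0.2),
    "mouse": ("rodent", 0.1),
    "rat": ("rodent", 0.1),
    "anc1": ("root", 0.1),
    "chicken": ("root", 0.5),
}

score_hm = branch_length_score(TOY_TREE, ["human", "mouse"])
score_hmc = branch_length_score(TOY_TREE, ["human", "mouse", "chicken"])

# Hypothetical scores for one site that is conserved more broadly
# as a 7mer-m8 than as an 8mer, and as a 6mer than as a 7mer-m8.
site_scores = {"8mer": 0.8, "7mer-m8": 1.5, "6mer": 2.5}
calls = {cutoff: nested_site_type(site_scores, cutoff) for cutoff in (0.5, 1.0, 2.0, 3.0)}
```

With these toy scores the site is called an 8mer at a low cutoff, a 7mer-m8 at a moderate cutoff, and a 6mer at a high cutoff, which is the switching behavior the caption describes.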
Figure 2.
Conservation of major seed-match types. (A) Conservation of 8mer sites for 87 broadly conserved miRNA families. High-sensitivity and high-specificity cutoffs are highlighted with broken lines at 1.0 and 2.0, respectively. (B) Conservation and background estimate for mutually exclusive site types at high sensitivity (left) and high specificity (right). The signal-to-background ratio is indicated above the pair of bars. Error bars indicate one standard deviation in the estimated background, based on subsampling of individual control k-mers. (C) Efficacy of offset 6mer sites. Microarray data monitoring mRNA destabilization following transfection of 11 miRNAs was analyzed as described previously (Grimson et al. 2007). Shown is the cumulative distribution of changes for transcripts containing exactly one offset 6mer site and no other canonical sites in their 3′UTR. For comparison, previously reported analyses of messages with single canonical sites are also shown (Grimson et al. 2007). (D) Signal-to-background ratio for indicated sites at increasing branch-length cutoff. Broken lines indicate 5% lower confidence limit (z-test). (E) Correlation of site conservation rate and experimental efficacy. Fraction of sites conserved above background was calculated as ([Signal – Background]/Signal) at a branch-length cutoff of 1.0. The minimal fraction of sites conferring destabilization was determined from the cumulative distributions (C), considering the maximal vertical displacement from the no-site distribution (correcting for the bumpiness of the distributions as described previously [Grimson et al. 2007]). (F) Estimates of signal above background for the major site types. Broken lines indicate 5% lower confidence limit (z-test). (G) Aggregate conservation above background for all major site types when using subsets of genomes. To facilitate overlay of the plots, the X-axis is signal-to-background ratio rather than branch-length cutoff. 
The 14-genome subset represents the non-fish species originally available in the UCSC 17-way alignments. The five-genome subset contains human, mouse, rat, dog, and chicken, and the two-genome subset contains only human and mouse.
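The three quantities plotted in Figure 2 (signal-to-background ratio, signal above background, and the fraction of sites conserved above background) are simple functions of the same two counts. A minimal Python sketch follows; the function name and the counts passed to it are hypothetical and are not values from the paper.

```python
def conservation_summary(signal: float, background: float) -> dict:
    """Summarize conserved-site counts against their background estimate.

    signal     -- number of sites conserved at a given branch-length cutoff
    background -- estimated number expected from matched control k-mers
    """
    if signal <= 0 or background <= 0:
        raise ValueError("signal and background must be positive")
    return {
        "signal_to_background": signal / background,
        "signal_above_background": signal - background,
        # The caption's formula: (Signal - Background) / Signal.
        "fraction_above_background": (signal - background) / signal,
    }


# Made-up counts, for illustration only.
summary = conservation_summary(signal=300, background=100)
```

Note that the fraction equals 1 minus the reciprocal of the signal-to-background ratio, so a ratio of 3.0 implies that about two-thirds of the conserved sites exceed what background conservation would explain.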
Figure 3.
Occasional preferential conservation of imperfect sites. (A) Signal-to-background ratio for sites with the indicated single-nucleotide mismatches and bulges. A mismatch or G:U wobble must occur opposite miRNA seed nucleotides 2–7. A bulge in the site must occur between bases that pair to consecutive seed nucleotides 2–7, and a site creating a bulge must involve a 7mer match that skips one of the seed nucleotides 2–7. Results for the canonical 6mer site (Fig. 2D) are included for comparison. Broken lines indicate 5% lower confidence limit (z-test). (B) Weak signal above background for a class of imperfect sites found to have a significantly positive signal-to-background ratio. Results for the canonical 6mer site (Fig. 2F) are included for comparison. Broken lines indicate 5% lower confidence limit (z-test).
Figure 4.
Conserved pairing to the 3′ ends of miRNAs. (A) Preferential occurrence of pairing to the 3′ region of the miRNA, which can supplement canonical sites (diagrammed at top). Signal-to-background ratio (top two graphs) and signal above background (bottom two graphs) are plotted at conservation branch-length cutoffs of 1.0 (left) and 2.0 (right). Five percent confidence limits are shown as dashed lines. (B) Preferential occurrence of pairing to the 3′ region of the miRNA, which can compensate for mismatches or bulges in seed pairing (diagrammed at top). As in A, except that seed matches were replaced as indicated with sites containing a mismatched, G:U wobble, or bulged nucleotide.
Figure 5.
Conservation of sites matching mammalian-specific miRNAs. (A) Signal-to-background ratio for sites matching 53 mammalian-specific miRNA families in 18 placental mammals, otherwise as in Figure 2D. (B) Signal above background for 8mer sites matching either broadly conserved or mammalian-specific miRNAs in 18 placental mammals, otherwise as in Figure 2F. Analysis of 8mer sites matching broadly conserved miRNAs considers either all sites (blue) or excludes those sites conserved beyond placental mammals (green). (C) Signal-to-background ratios for 8mer sites matching individual miRNAs in orthologous 3′UTRs of placental mammals at optimal sensitivity (branch-length cutoff of 0.85). For the broadly conserved miRNA set, conservation signal excludes sites conserved beyond placental mammals. Distributions expected if miRNA targeting conferred no preferential conservation were estimated using the average signal-to-background ratio of 8mer controls selected for each site, considering GC content and dinucleotide-based conservation (broken lines). Expectations differed between the two sets because of different miRNA numbers and different dinucleotide compositions.
Figure 6.
Correlation of P(CT) with mRNA destabilization. (A) Destabilization of human messages with exactly one 7mer-m8 3′UTR site to a transfected miRNA (Grimson et al. 2007). Messages were grouped into six equal bins based on the site P(CT). (B) Destabilization of human messages with exactly one 7mer-m8 3′UTR site to a transfected miRNA (Grimson et al. 2007), considering only those sites that were not conserved in mouse, rat, or dog.
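The binning described for Figure 6A might look like the following Python sketch. It assumes that "six equal bins" means bins holding equal numbers of messages (rather than equal-width P(CT) intervals), and the function name, the input layout, and the toy values are all hypothetical; none of the numbers come from the paper.

```python
from statistics import mean


def mean_change_by_pct_bin(sites, n_bins=6):
    """Group sites into equal-count bins by P(CT) and average each bin.

    sites -- list of (pct, log2_fold_change) pairs, one per message
    Returns a list of (mean_pct, mean_log2_fold_change), lowest P(CT) first.
    """
    ordered = sorted(sites, key=lambda site: site[0])
    n = len(ordered)
    bins = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin sizes within one of each other.
        chunk = ordered[i * n // n_bins:(i + 1) * n // n_bins]
        if chunk:
            bins.append((mean(p for p, _ in chunk), mean(c for _, c in chunk)))
    return bins


# Made-up toy input: 12 sites whose repression grows with P(CT).
toy_sites = [(i / 12, -0.05 * i) for i in range(12)]
binned = mean_change_by_pct_bin(toy_sites)
for mean_pct, mean_change in binned:
    print(f"mean P(CT) {mean_pct:.2f}  mean log2 fold change {mean_change:.3f}")
```

For panel B, the same function could be applied after filtering the input to sites not conserved in mouse, rat, or dog.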
