Kyle Kai-How Farh 1, Andrew Grimson 1, Calvin Jan 1, Benjamin P. Lewis 2, Wendy K. Johnston 1, Lee P. Lim 3, Christopher B. Burge 4, David P. Bartel 5, *
1 Whitehead Institute for Biomedical Research, Department
of Biology, Massachusetts Institute of Technology, and Howard Hughes Medical
Institute, Nine Cambridge Center, Cambridge, MA 02142, USA.
2 Whitehead Institute for Biomedical Research, Department
of Biology, Massachusetts Institute of Technology, and Howard Hughes Medical
Institute, Nine Cambridge Center, Cambridge, MA 02142, USA; Department
of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139,
USA.
3 Rosetta Inpharmatics, a wholly owned subsidiary of
Merck and Co., 401 Terry Avenue N, Seattle, WA 98109, USA.
4 Department of Biology, Massachusetts Institute of Technology,
Cambridge, MA 02139, USA.
5 Whitehead Institute for Biomedical Research, Department
of Biology, Massachusetts Institute of Technology, and Howard Hughes Medical
Institute, Nine Cambridge Center, Cambridge, MA 02142, USA.
* To whom correspondence should be addressed.
David P. Bartel, E-mail: dbartel@wi.mit.edu
Supporting Online Material: http://www.sciencemag.org/cgi/content/full/1121158/DC1
Thousands of mammalian mRNAs are under selective pressure to maintain 7-nucleotide sites matching microRNAs (miRNAs). We find that these conserved targets are often highly expressed at developmental stages prior to miRNA expression, and that their levels fall as the miRNA that targets them begins to accumulate. Nonconserved sites, which outnumber the conserved ten-to-one, also mediate repression. As a consequence, genes preferentially expressed at the same time and place as a miRNA have evolved to selectively avoid sites matching the miRNA. This phenomenon of selective avoidance extends to thousands of genes and enables spatial and temporal specificities of miRNAs to be revealed by finding tissues and developmental stages in which messages with corresponding sites are expressed at lower levels.
We used the mouse expression atlas (6) to examine
the expression of the predicted targets of six tissue-specific miRNAs:
miR-1 and miR-133 (skeletal muscle), miR-9 and miR-124 (brain),
miR-122 (liver) and miR-142-3p (hematopoietic organs and blood
cells) [(7–10), fig.
S1]. The 250 messages with conserved miR-133 sites were generally expressed
in muscle but at lower levels in muscle than in other tissues (Fig.
1A).
Fig. 1. Gene-density maps of conserved miRNA targets.
(A) Predicted targets of miRNAs in tissues expressing the miRNAs. For muscle (large panel, left), the genes of the expression atlas were first placed in 61 equally populated bins along the x-axis and 61 equally populated bins along the y-axis. Along the x-axis, genes were sorted based on whether they were expressed at low (left) or high (right) levels in muscle. Along the y-axis, genes were sorted based on whether they were expressed higher (top) or lower (bottom) in muscle compared to other tissues. Predicted targets of miR-133 were then mapped onto this 61 by 61 grid. Local density (after background subtraction, fig. S2, and smoothing) of miR-133 targets is color-coded, with regions of enrichment (red) or depletion (blue) shown (key at far right). Other miRNA–tissue pairs were analyzed analogously (smaller panels, right).
(B) Time course of predicted targets during myoblast (C2C12) differentiation to myotubes, analyzed using a 24 by 24 grid.
(C) Time course of predicted targets during mouse embryogenesis, analyzed as in (A). Predicted targets of let-7 are included for comparison in (B) and (C).
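The binning described in the Figure 1 legend can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the background subtraction shown here (observed count minus the count expected if targets were spread evenly over all genes) is a simple stand-in for the procedure of fig. S2, and the smoothing step is omitted.

```python
def equal_population_bins(values, n_bins):
    """Assign each value a bin index (0 .. n_bins-1) so bins hold near-equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def target_density_grid(abs_expr, rel_expr, is_target, n_bins=61):
    """Observed minus expected number of predicted targets in each grid cell.

    abs_expr:  expression level of each gene in the tissue (x-axis).
    rel_expr:  expression in the tissue relative to other tissues (y-axis).
    is_target: True for genes that are predicted targets of the miRNA.
    The expectation is an ASSUMED uniform background, not the paper's method.
    """
    xb = equal_population_bins(abs_expr, n_bins)
    yb = equal_population_bins(rel_expr, n_bins)
    hits = [[0] * n_bins for _ in range(n_bins)]
    total = [[0] * n_bins for _ in range(n_bins)]
    for x, y, t in zip(xb, yb, is_target):
        total[y][x] += 1
        if t:
            hits[y][x] += 1
    frac = sum(is_target) / len(is_target)  # overall fraction of genes that are targets
    return [[hits[y][x] - frac * total[y][x] for x in range(n_bins)]
            for y in range(n_bins)]
```

Positive cells correspond to enrichment (red in the figure) and negative cells to depletion (blue).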
Likewise, predicted targets of the other miRNAs were usually at lower levels in the tissue expressing the miRNA than in other tissues (Fig. 1A). Brain-specific miR-9 and miR-124 displayed more complex patterns, perhaps reflecting the heterogeneous cell types within the brain.
The low relative expression of predicted targets in differentiated
tissues raised the question of whether they
might be more highly expressed earlier in differentiation, prior
to miRNA expression. To address this, we analyzed expression profiles of
myotube differentiation (11), during which miR-1 and
miR-133 accumulate following cell-cycle arrest (12).
Predicted targets of these muscle-specific miRNAs were preferentially high
prior to miRNA expression and then dropped as the miRNAs accumulated (Fig.
1B; fig. S3). The observation
that miRNAs induced during differentiation tend to target messages highly
expressed in the previous
developmental stage suggests a function analogous to that proposed
for plants, whereby miRNAs dampen the output of pre-existing messages to
facilitate a more rapid and robust transition to a new expression program
(13). The tendency of predicted targets to be expressed
at substantial levels on the absolute scale (Fig. 1A,
x-axis) further suggested that metazoan miRNAs are often optimizing protein
output without eliminating it entirely (14).
Our results are consistent with the idea that miRNAs are destabilizing
many target messages to further define tissue-specific transcript profiles
(15) but also leave open the possibility that many targets
are repressed translationally without mRNA destabilization. If miRNAs were
usually working in concert with transcriptional and other regulatory processes
to down-regulate the same genes, then a correlation
between conserved targeting and lower mRNA levels would be observed
even for messages that miRNAs translationally repress without destabilizing.
Mammalian miRNA families have an average of ~200 conserved targets
above estimated background, a figure approximately one tenth the number
of 3' UTRs with 7-nt sites in a single genome (3, 5).
Computational algorithms rely on evolutionary conservation to distinguish
functional miRNA targets from the thousands of messages that would pair
equally well; in contrast, the cell must rely on specificity
determinants intrinsic to a single genome. To determine whether
these nonconserved sites might be functional, we used reporter assays to
compare repression mediated by conserved and nonconserved sites. We selected
two targets of miR-1, predicted by TargetScan based on conservation in
human, mouse and rat (16) and six human UTRs that had
comparable TargetScan scores in human but low or nonexistent scores in
mouse or rat. When UTR fragments of ~0.5 kilobases containing the sites
were placed in reporters,
specific repression was observed for all eight (Fig.
2A). Analogous experiments with eight predictions from our more sensitive
analysis, TargetScanS, which searches for conserved 7- or 8-nt matches
(3), and 17 genes with nonconserved matches also detected
little difference between UTR fragments containing conserved and nonconserved
sites (Fig. 2B), even when the concentration of transfected
miRNA was titrated to suboptimal levels (fig.
S4).
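As a rough illustration of what a 7-nt match means in practice, the following Python sketch finds sites in a UTR and labels a gene by whether the site is present in each ortholog. It rests on assumptions not spelled out in this text: a site is taken to be the reverse complement of miRNA nucleotides 2 to 8, presence anywhere in the orthologous UTR counts as presence (no alignment is used), the miR-1 sequence in the usage note is supplied only for illustration, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """mRNA site pairing with miRNA nucleotides start..end (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_sites(utr, site):
    """0-based positions of every (possibly overlapping) occurrence of site in utr."""
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA alphabets
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


def classify(mouse_utr, human_utr, site):
    """Label a gene by presence of the site in each ortholog (assumed, alignment-free)."""
    in_mouse = bool(find_sites(mouse_utr, site))
    in_human = bool(find_sites(human_utr, site))
    if in_mouse and in_human:
        return "both"
    if in_mouse:
        return "mouse_only"
    if in_human:
        return "human_only"
    return "neither"
```

For example, `seed_site("UGGAAUGUAAAGAAGUAUGUAU")` gives the 7-nt site `ACAUUCC` for the miR-1 sequence assumed here, and `classify` can then separate genes with sites in both species from those with sites in only one.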
Fig. 2. MicroRNA-mediated repression of luciferase reporter genes
containing 3' UTR fragments with conserved or nonconserved sites.
(A) UTR fragments with TargetScan-like miR-1 sites. Luciferase activity
from HeLa cells cotransfected
with miRNA and wild-type reporters was normalized to that from cotransfection
with mutant reporters with three point substitutions disrupting each seed
match. The miR-124 transfections served as specificity controls. Error
bars represent the third largest and third smallest values among 12 replicates (one
asterisk, P < 0.01; two asterisks, P < 0.001,
Wilcoxon rank-sum test).
(B) UTR fragments with TargetScanS-like miR-1 (top) and miR-124 (bottom) sites, analyzed as in (A).
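The normalization and error bars described in the Figure 2 legend can be sketched as follows. This is a minimal illustration under stated assumptions: each wild-type reading is divided by the median of the mutant-reporter readings, and the plotted central value is the median ratio. The legend specifies only that wild-type activity was normalized to mutant activity and that error bars mark the third largest and third smallest of 12 replicates; the function name is hypothetical.

```python
from statistics import median


def summarize_repression(wild_type, mutant):
    """Normalize wild-type luciferase readings to the mutant reporter.

    Returns the median normalized activity plus the third smallest and
    third largest normalized values, used as the error-bar limits.
    Dividing by the mutant MEDIAN is an assumption made for this sketch.
    """
    if len(wild_type) < 5:
        raise ValueError("need at least 5 replicates for third-from-each-end bars")
    reference = median(mutant)
    ratios = sorted(w / reference for w in wild_type)
    return {
        "center": median(ratios),
        "low": ratios[2],    # third smallest
        "high": ratios[-3],  # third largest
    }
```

A ratio below 1 indicates that the intact site lowers reporter output relative to the mutated site.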
Apparently, most nonconserved sites fortuitously reside in local contexts suitable for mediating repression and therefore have the potential to function when exposed to the miRNA. These results generalize previous work showing that in certain contexts 7- or 8-nt matches appear sufficient for miRNA-like regulation (4, 17, 18). We conclude that additional recognition features, such as pairing to the remainder of the miRNA, accessible mRNA structure, or protein-binding sites, are usually dispensable, or occur so frequently that they impart little overall specificity (supporting online text).
To explore the impact of this vast potential for nonconserved targeting,
we examined the expression of
messages with nonconserved 7-nt matches to tissue-specific miRNAs,
focusing first on messages with sites present in mouse but not in the orthologous
human UTRs (Fig. 3A).
Fig. 3. Density maps for genes with nonconserved sites.
(A) Messages with site present in mouse UTR but absent in human ortholog. Panels are as in Figure 1, but enrichment is relative to matched cohorts (figs. S5 and S6), controlling for UTR length and nucleotide composition.
(B) Messages with site present in human UTR but absent in orthologous mouse UTR, analyzed as in (A).
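To distinguish whether the depletion of sites among messages expressed highly and specifically in the tissue (Fig. 3A) reflects direct miRNA-mediated destabilization of those messages or evolutionary avoidance of sites, we plotted the expression,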
in mouse, of genes that lacked sites in the mouse UTR but contained a site
in the human ortholog. Because such messages would not be subject to miRNA-mediated
destabilization in mouse, the depletion signal would vanish if it reflected
only direct destabilization. However, the signal persisted; mouse genes
expressed highly and specifically in the tissue were less likely to harbor
sites in their human orthologs (Fig. 3B), indicating
that genes preferentially co-expressed with the miRNA have evolved to avoid
targeting by that miRNA. The enrichment for genes
expressed at low levels also explained some of the many potentially
functional nonconserved sites; they accumulate by chance, without consequence,
in messages not co-expressed with the miRNA. The reduction in signal in
Figure 3B compared to 3A hints that
species-specific mRNA destabilization might also be frequent, presumably
as both neutral and consequential species-specific targeting.
Quantifying selective depletion of sites among messages preferentially
expressed in muscle indicated that ~420 of the 8511 genes of the expression
atlas are under selective pressure to avoid miR-133 sites. These are “antitargets,”
an anticipated class of genes not observed previously (14).
The estimated numbers of antitargets for miR-1, miR-122, miR-142, miR-9
and miR-124 were 300, 190, 170, 240, and 440, respectively—comparable to
the numbers of their conserved targets. Extrapolating to include other
miRNA families that are also highly expressed with specific spatial or
temporal expression patterns, we estimate that selective avoidance of miRNA
targeting extends to thousands of genes (supporting
online text). A signal for messages avoiding targeting in all tissue
types would be harder to detect in our analysis. For some messages, acquiring
miRNA pairing might be so detrimental that they are under selective pressure
to have
short UTRs, perhaps helping to explain why highly expressed “house-keeping”
genes have substantially shorter UTRs than do other messages (19).
In addition to revealing target avoidance, these data extend results of our heterologous reporter system (Fig. 2) into the animal, showing that 7-nt sites are often sufficient to specify a biological effect. Messages expressed highly and specifically in muscle are ~59% less likely than controls to possess a 7-nt match to muscle-specific miR-133 (Fig. 3A). For the other five miRNAs, this depletion averaged 45% (range 31–57%). This extent of depletion implies that as sites for highly expressed miRNAs emerge during sequence drift of UTRs, about half emerge in a context suitable for miRNA targeting—causing either mRNA destabilization or a selective disadvantage sufficient for preferential loss of the site from the gene pool.
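The depletion percentages quoted above are a ratio of two frequencies, which a short Python sketch makes explicit. The function name and the numbers in the usage note are hypothetical and chosen only to reproduce a 59% figure; they are not data from the paper.

```python
def site_depletion(n_with_site, n_genes, control_fraction):
    """Fractional depletion of sites relative to matched controls.

    n_with_site:      co-expressed messages that contain a 7-nt match.
    n_genes:          all co-expressed messages examined.
    control_fraction: fraction of matched control messages with a match.
    """
    observed_fraction = n_with_site / n_genes
    return 1.0 - observed_fraction / control_fraction
```

With made-up numbers, if 20% of control messages carry a site but only 41 of 500 co-expressed messages do (8.2%), `site_depletion(41, 500, 0.20)` is about 0.59, meaning the co-expressed messages are 59% less likely than controls to carry the site.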
Site depletion due to miRNA activity should occur specifically in
tissue types expressing the miRNA. To explore the specificity of depletion,
we used a modified Kolmogorov-Smirnov (KS) test to determine whether the
set of genes with sites in either human or mouse orthologs were expressed
at lower levels than cohorts of genes with the same estimated expectation
for having sites, controlling for UTR length and nucleotide composition.
In muscle, but not in T cells, the set of transcripts with a miR-133 site
was depleted compared to control cohorts (Fig. 4A).
Fig. 4. Depletion of sites in genes preferentially co-expressed
with the miRNA.
(A) miR-133 sites in skeletal muscle and CD8+ T-cells. For each panel,
genes were binned based on their expression in the indicated tissue compared
to expression in the 60 other tissues, with bin 1 lowest and bin 61 highest.
Top: difference between observed and expected number of messages
with miR-133 sites at each expression rank.
Bottom: modified KS test and estimate of significance, showing
the running sum of the difference between the observed and expected distributions
across expression ranks for messages with sites (red)
compared to control cohorts (blue).
(B) Summary map of KS tests for each miRNA-tissue pair for 28 miRNAs; P-value key is shown above. Reported expression is from zebrafish in situ data (10), supplemented with notable mammalian data (8, 9) (parentheses).
(C) Tail of P-value distribution for all 73 miRNA families (left, fig. S7) and for a mock analysis using control sequences (right). P-values greater than 10⁻³, which are gray in (B), were only marginally less frequent for controls.
(D) RNA-blot analysis of miR-7 in rat tissues, reprobed for miR-124 and U6 snRNA.
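The running-sum statistic described in the legend of panel (A) can be sketched in Python as below. This is an illustrative approximation, not the authors' modified KS test: it assumes the statistic is the largest absolute excursion of the running sum of observed minus expected counts across expression bins, and that significance is estimated empirically from control cohorts. Function names are hypothetical.

```python
from itertools import accumulate


def running_sum(observed, expected):
    """Cumulative sum of (observed - expected) site counts across expression bins."""
    return list(accumulate(o - e for o, e in zip(observed, expected)))


def ks_like_statistic(observed, expected):
    """Largest absolute excursion of the running sum (assumed test statistic)."""
    return max(abs(v) for v in running_sum(observed, expected))


def empirical_p(observed, expected, cohorts):
    """Share of control cohorts at least as extreme as the observed set.

    cohorts: per-bin site counts for each control cohort, binned the same way.
    One is added to numerator and denominator so the estimate is never zero.
    """
    stat = ks_like_statistic(observed, expected)
    hits = sum(ks_like_statistic(c, expected) >= stat for c in cohorts)
    return (hits + 1) / (len(cohorts) + 1)
```

When sites are depleted among the most highly expressed bins, the observed counts exceed expectation at low ranks and fall short at high ranks, so the running sum rises and then returns toward zero, producing a large excursion.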
Signatures for all 73 miRNA families (representing 169 human miRNA
genes) conserved among the four sequenced mammals and zebrafish were derived
(fig. S7). For many miRNA families prominently expressed in specific tissues
(7–10), the signatures corresponded to tissues in which
these miRNAs are expressed (Fig. 4B). These included
the six families featured in Figure 3, as well as let-7,
miR-99, miR-10, miR-29, and miR-153 (brain), miR-30 (kidney),
miR-194 (liver, gut, kidney), miR-141 and miR-200b (olfactory
epithelium, gut), miR-96 (olfactory epithelium), and
miR-375 (pituitary). miR-7 had highest signal in the pituitary.
This miRNA is known to be preferentially expressed in the brain (8–10),
but preferential expression in pituitary had not been noted. An RNA blot
confirmed that miR-7 expression is highest in the pituitary (Fig.
4D).
Other miRNA families, including most described as having ubiquitous,
complex, or undetectable expression
patterns, were indistinguishable from controls (Fig.
4C, fig. S7). Nonetheless,
some described as ubiquitous displayed stage-specific signatures. These
included families in the miR-17~18~19a~20~19b~92 cluster, which had a strong
embryo signature, consistent with their association with proliferation
and cancer (20, 21). The miR-302 family also had a strong
early-embryo signature, consistent with its sequence similarity to the
17~92 proliferation cluster and its expression in embryonic stem cells
(22, 23). The conserved targets of
these embryonic miRNAs were preferentially at high levels in the
oocyte and zygote and then dropped to low levels in the blastocyst and embryo
(Fig. 1C), as expected if these miRNAs help dampen expression
of maternal transcripts.
A signal for motif conservation is a mainstay of bioinformatics and
previously indicated the widespread scope
of conserved miRNA targeting (3–5, 24),
but a signal for absence of a motif is unusual. The ability to observe
such a signal revealed an additional dimension to the impact of miRNAs
on UTR evolution—a widespread potential for nonconserved targeting leading
to the selective loss of many 7-nt sites. When considering conserved targeting,
nonconserved targeting, and targeting avoidance, it is hard to escape the
conclusion that miRNAs are influencing the expression or evolution of most
mammalian mRNAs.
References and Notes
1. D. P. Bartel, Cell 116, 281 (2004).
2. V. Ambros, Nature 431, 350 (2004).
3. B. P. Lewis, C. B. Burge, D. P. Bartel, Cell 120, 15 (2005).
4. J. Brennecke, A. Stark, R. B. Russell, S. M. Cohen, PLoS Biol
3, e85 (2005).
5. A. Krek et al., Nat Genet 37, 495 (2005).
6. A. I. Su et al., Proc Natl Acad Sci U S A 101, 6062 (2004).
7. M. Lagos-Quintana et al., Curr Biol 12, 735 (2002).
8. L. F. Sempere et al., Genome Biol 5, R13 (2004).
9. S. Baskerville, D. P. Bartel, RNA 11, 241 (2005).
10. E. Wienholds et al., Science 309, 310 (2005).
11. K. K. Tomczak et al., Faseb J 18, 403 (2004).
12. P. K. Rao, M. Farkhondeh, S. Baskerville, H. F. Lodish, (unpublished
data).
13. M. W. Rhoades et al., Cell 110, 513 (2002).
14. D. P. Bartel, C. Z. Chen, Nat Rev Genet 5, 396 (2004).
15. L. P. Lim et al., Nature 433, 769 (2005).
16. B. P. Lewis, I. H. Shih, M. W. Jones-Rhoades, D. P. Bartel,
C. B. Burge, Cell 115, 787 (2003).
17. J. G. Doench, P. A. Sharp, Genes Dev 18, 504 (2004).
18. E. C. Lai, B. Tam, G. M. Rubin, Genes Dev 19, 1067 (2005).
19. E. Eisenberg, E. Y. Levanon, Trends Genet 19, 362 (2003).
20. A. Ota et al., Cancer Res 64, 3087 (2004).
21. L. He et al., Nature 435, 828 (2005).
22. H. B. Houbaviy, M. F. Murray, P. A. Sharp, Dev Cell 5, 351 (2003).
23. M. R. Suh et al., Dev Biol 270, 488 (2004).
24. X. Xie et al., Nature 434, 338 (2005).
25. We thank Graham Ruby, Michael Axtell and Hannah Chang for helpful
discussions. Supported by predoctoral fellowships from the DOE (B.P.L.)
and NSF (C.J.), and a postdoctoral fellowship and grants from the NIH (A.G.,
C.B.B., D.P.B.). D.P.B. is an HHMI Investigator.
Supporting Online Material:
http://www.sciencemag.org/cgi/content/full/1121158/DC1
Materials and Methods
SOM Text
Figs. S1 to S7
Tables S1 and S2
References
11 October 2005; accepted 10 November 2005
Published online 24 November 2005;
10.1126/science.1121158