Michael Boutros 1*, Amy A. Kiger 1*, Susan Armknecht 1, 2, Kim Kerr 1, 2, Marc Hild 3, Britta Koch 3, Stefan A. Haas 4, Heidelberg Fly Array Consortium 3, Renato Paro 3, Norbert Perrimon 1, 2, @
1 Department of Genetics, Harvard Medical School, Boston,
MA 02115, USA.
2 Howard Hughes Medical Institute (HHMI), Harvard Medical
School, Boston, MA 02115, USA.
3 Zentrum für Molekulare Biologie der Universität
Heidelberg, D-69120 Heidelberg, Germany. The Heidelberg Fly Array Consortium
consists of Marc Hild, Boris Beckmann, Stefan Haas, Britta Koch, Martin
Vingron, Frank Sauer, Jörg Hoheisel, and Renato Paro.
4 Max-Planck Institute for Molecular Genetics, D-14195
Berlin, Germany.
* These authors contributed equally to this work.
@ To whom correspondence should be addressed.
E-mail: perrimon@rascal.med.harvard.edu
A crucial aim upon completion of whole genome sequences is the functional
analysis of all predicted genes. We have applied a high-throughput RNA-interference
(RNAi)
screen of 19,470 double-stranded (ds) RNAs in cultured cells
to characterize the function of nearly all (91%) predicted Drosophila
genes in cell growth and viability. We found 438 dsRNAs that identified
essential genes, among which 80% lacked mutant alleles. A
quantitative assay of cell number was applied to identify genes
of known and uncharacterized functions. In particular, we demonstrate a
role for the homolog of a mammalian acute myeloid leukemia gene (AML1)
in cell survival. Such a systematic screen for cell phenotypes, such as
cell viability, can thus be effective in characterizing functionally related
genes on a genome-wide scale.
Fig. 1. Genome-wide RNAi screens result in highly reproducible growth and viability defects. Results for Kc167 cells are shown; similar results were observed for S2R+ cells (22) (table S1).
(A) Left panel: Luciferase activity (relative light units), indicative of ATP levels, is correlated with the number of Drosophila cells in a high-throughput assay format (fig. S1). Right panel: Treatment with dsRNA targeting D-IAP1 induced a time-dependent decrease in cell viability, shown as the readout relative to cells treated with control green fluorescent protein (gfp) dsRNA (normalized to 1).
(B) Fluorescence microscopy of cells after 3 days of RNAi. More dying cells were detected after treatment with D-IAP1 dsRNA than with control dsRNAs, as shown by the ratio of SYTOX green–labeled nuclei (green and lower panel) to Hoechst 33342–labeled nuclei (red).
(C) Results from one genome-wide RNAi screen, after 5 days dsRNA
treatment. Each RNAi experiment is
represented by a shaded box (a single well), arranged by 384-well
plates as outlined in upper left. Results in each plate were mean-centered
before overall analysis. Gray values indicate z score, with darker shades
representing below-average results. Each 384-well plate had four control
wells containing either D-IAP1 or the negative controls gfp,
Rho1,
or no dsRNAs. The D-IAP1 control phenotypes are evident as the dark
boxes in the upper left corner of each plate, indicative of dying cells
and a lower signal.
(D) Example of highly reproducible phenotypes with similar z
scores from two independent RNAi screens
[enlarged from (C) and from duplicate screen in
fig. S2].
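The plate normalization described in (C) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. The function names, the use of a population standard deviation pooled over all centered wells, and the sign flip (so that a lower signal gives a positive z score, in line with the positive z scores reported for D-IAP1) are all assumptions made for the example.

```python
from statistics import mean, pstdev


def center_plate(readings):
    """Subtract the plate mean from every well reading on one plate."""
    plate_mean = mean(readings)
    return [r - plate_mean for r in readings]


def screen_z_scores(plates):
    """Mean-center each plate, then scale by the SD of all centered wells.

    Assumption: the sign is flipped so that a below-average signal
    (fewer cells) yields a positive z score.
    """
    centered = [center_plate(p) for p in plates]
    pooled = [v for plate in centered for v in plate]
    sd = pstdev(pooled)
    if sd == 0:
        raise ValueError("all wells are identical; z scores are undefined")
    return [[-v / sd for v in plate] for plate in centered]


# Hypothetical luminescence readings for two tiny "plates".
example = screen_z_scores([[10.0, 12.0, 14.0], [20.0, 22.0, 24.0]])
```

Because each plate is centered separately, the two example plates give identical z scores even though their raw signal levels differ, which is the point of centering before the overall analysis.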
Fig. 2. Similar quantitative RNAi phenotypes of genes encoding ribosomal proteins. Averaged RNAi phenotypes of 72 genes encoding all annotated ribosomal proteins tested (gray bars) are distinguishable from negative controls (white bars, gfp dsRNA) and more severe phenotypes (black bars, D-IAP1 dsRNA). The gfp and D-IAP1 results represent negative and positive control experiments (scored one per plate) over a genome-wide screen. Intergroup comparisons are highly significant in a Student's t test (P < 0.0001) (fig. S3).
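For readers who want to reproduce an intergroup comparison of this kind, the two-sample Student's t statistic with pooled variance can be computed as below. This is a sketch under stated assumptions: the caption does not say which variant of the test was used, the function name is hypothetical, and the input values are placeholders rather than data from the screen. The P value is not computed here because it requires the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (classic Student's t test).
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Placeholder values standing in for two groups of phenotype scores.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```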
Fig. 3. Quantitative grouping of RNAi phenotypes.
(A) Distribution of the frequency of RNAi phenotypes recovered for each specified range of z scores. We used a z score of three or more standard deviations from the mean as a threshold to select 438 results for further analysis (tables S1 and S10).
(B and C) Frequency of encoded functional groups as predicted by InterPro protein domains and manually assigned to representative categories (tables S1 and S4) for all selected phenotypes [(B), z score > 3] and the most severe phenotypes [(C), z score > 5], revealing significant changes in the abundance of predicted ribosome proteins (P < 0.001) and proteins with no predicted domains (P < 0.00001). z scores were averaged across experiments.
(D) Classification of quantitative RNAi phenotypes of selected genes (rows) identifies groups of related and new gene functions, as determined from duplicate screens per cell type (columns) and visualized by z score (scale, bottom). Both D-IAP1* (added control) and D-IAP1 (within RNAi library) yield equivalent phenotypes.
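The selection described in (A) to (C), averaging z scores across experiments and then applying the cutoffs of 3 and 5, can be sketched as follows. The function name, the dictionary layout, and the dsRNA labels are hypothetical, and the example z scores are invented for illustration. The caption says "three or more" for the selection threshold and "> 5" for the severe class, so the sketch uses an inclusive cutoff for the first and a strict one for the second.

```python
from statistics import mean


def classify_hits(z_by_dsrna, hit_cutoff=3.0, severe_cutoff=5.0):
    """Average replicate z scores and return (selected, severe) dsRNA names."""
    selected, severe = [], []
    for name, scores in z_by_dsrna.items():
        avg = mean(scores)
        if avg >= hit_cutoff:      # "three or more standard deviations"
            selected.append(name)
        if avg > severe_cutoff:    # most severe class, z score > 5
            severe.append(name)
    return selected, severe


# Invented replicate z scores for three hypothetical dsRNAs.
selected, severe = classify_hits({
    "dsRNA_A": [3.5, 2.9],
    "dsRNA_B": [6.8, 7.3],
    "dsRNA_C": [0.1, -0.4],
})
```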
Of the 438 cases, 47% (206 out of 438) had an associated Gene Ontology annotation (17), and 59% (260 out of 438) encoded an identifiable InterPro protein domain (18). When the most abundant domain predictions were used to categorize genes into distinct functional classes (Fig. 3B; individual predictions and assignments in tables S1 and S4) (12), the relative distribution of predicted gene functions differed with the quantitative severity of the RNAi phenotypes (Fig. 3, B and C).
The phenotypic screen also identified genes encoding sets of proteins
in known biochemical complexes, as revealed by sequence-based classification
(Fig. 3B). Examples of this were within two of the most
abundant categories pertaining to protein translation (56 genes, "Ribosome")
and ubiquitylation and protein degradation (34 genes, "Proteasome"). Genes
with predicted roles in the cell cycle showed quantitative phenotypes similar
to those involved in protein translation, but in only one of the two cell
types screened ("Cell Cycle," Fig. 3D; supporting
online material text). One of the most populated categories consisted
of genes for 62 proteins with predicted DNA binding domains ("DNA binding,"
Fig. 3B, table S1), including chromatin-related factors (e.g., core histone
and high-mobility group (HMG)–box domains: bss and CG17836; Fig. 3D) and
representative members of transcription factor families (e.g., homeobox, ets,
and AML domains: abd-A, aop, and CG15455; Fig. 3D). Only genes for specific
transcription factors from within different families were identified. For
example, although four different AML-like genes are encoded within the fly
genome, only one, CG15455, was functionally identified in the screen
(fig. S4). Serpent, a GATA-1 homolog with roles in fly and mammalian blood
cell development and survival (19), was identified as
the only one of five predicted GATA-type zinc-finger transcription factors
(srp, Fig. 3D). Proteins with a predicted DNA
binding domain comprised the largest assigned category of genes identified,
both in total (14%, Fig. 3B) and in the class with the
most severe phenotypes (19%, z score > 5, Fig. 3C),
an enrichment from the proportion found in the genome (5%).
Overall, the largest category of genes (41%) had no recognizable predicted protein domain (178 genes, "No Prediction," Fig. 3B), suggesting that the screen identified many uncharacterized genes with essential cellular roles (table S1). For example, severe cell viability phenotypes (z scores 6.8 and 7.3) were observed with HFA13298 dsRNA targeted against a newly predicted gene, HDC14318, with six overlapping expressed-sequence tags mapped to the same region (RH13972, RH23223, RH26651, RH22174, RH26647, and RH62785). Among the 438 genes with phenotypes, the proportion without a predicted protein domain increased with phenotypic severity (63% "No Prediction," z score > 5; Fig. 3C).
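The category percentages quoted in the text follow directly from the gene counts reported out of 438 selected results. The short check below recomputes them from those counts. The variable and function names are illustrative only, and the fold enrichment for DNA binding proteins is simply the ratio of the observed fraction to the 5% genome proportion stated above, not a statistic reported by the authors.

```python
TOTAL_HITS = 438  # dsRNAs selected at the z score threshold

# Counts taken from the text.
counts = {
    "Gene Ontology annotation": 206,
    "InterPro domain": 260,
    "DNA binding": 62,
    "No Prediction": 178,
}


def percent_of_hits(count, total=TOTAL_HITS):
    """Express a gene count as a whole-number percentage of all hits."""
    return round(100 * count / total)


percentages = {label: percent_of_hits(n) for label, n in counts.items()}

# Ratio of the DNA binding fraction among hits to the stated genome
# proportion (5%).
dna_binding_fold = (counts["DNA binding"] / TOTAL_HITS) / 0.05
```

The genes with and without an identifiable InterPro domain (260 and 178) sum to the full set of 438, so the 59% and 41% figures are complementary.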
We also identified uncharacterized genes with phenotypes quantitatively
similar to that of D-IAP1 (z score > 5), raising the possibility
that these loss-of-function phenotypes resulted from cell death, perhaps
due to the activation of apoptosis. We further evaluated two such genes,
CG11700,
a ubiquitin-like gene, and CG15455, a gene encoding an AML1-like
transcription factor (z scores 7.2 and 7.4 in Kc167 cells,
respectively, Fig. 3D). The phenotypic severity could
not be attributed to an accumulated arrest in transition at one stage in
the cell cycle (12) (Fig. 4A and fig. S5). As indicated by terminal
transferase-labeled DNA breaks, over 95% of cells treated with dsRNA to
CG11700 or D-IAP1, and 20% of cells treated with dsRNA to CG15455, were
apoptotic (12)
(Fig. 4B). The addition of a pan-caspase inhibitor, Z-Val-Ala-DL-Asp(O-Methyl)-fluoro-methylketone
(z-VAD-fmk), reverted the cell death in response to the RNAi of CG11700
and D-IAP1, and to a lesser extent of CG15455 and other transcription
factors (Fig. 4C). D-IAP1 directly inhibits the proapoptotic
caspase, Nc (Nedd2-like) (20). The CG11700 and D-IAP1 dsRNA-induced
cell death phenotypes were both rescued by combined RNAi removing the single
Nc caspase function (Fig. 4C). In contrast, neither the loss of function of
Nc (Fig. 4) nor the loss of function of the transcriptionally activated
proapoptotic gene reaper (19, 21, 22) (fig. S6) was sufficient to suppress
cell death upon co-RNAi with CG15455
or other tested transcription factors. Together, these results suggest
that the ubiquitin-like CG11700 may act in the same pathway as D-IAP1
to directly prevent Nc caspase-activated apoptotic cell death. In contrast,
a set of essential transcription factors may regulate complex responses
for cell fate, proliferation, and/or cell survival that directly or indirectly
initiate a partially caspase-dependent apoptotic program.
Fig. 4. Different anti-apoptotic gene functions identified by
severe RNAi viability phenotypes. Experiments shown were assayed in Kc167
cells following RNAi against a negative control (gfp) and the D-IAP1,
CG11700 (ubiquitin-like), and CG15455 (AML-1-like, fig. S4) genes, each
identified in the screen by severe phenotypes (z scores 7.0, 7.2, and 7.4,
respectively).
(A) Flow cytometry analysis of propidium iodide (PI)–stained DNA
after 3 days of RNAi, as indicated (12). Analyses of total
events reveal decreased cell size and DNA content, indicative of dying
cells. Cell cycle distribution analysis was performed on viable cells with
>2N DNA content [percentage of total events shown, fig. S5 (12)]. FSC,
forward scatter channel.
(B) Two classes of severe RNAi phenotypes were distinguished by the proportion of apoptotic cells (>95 and 20%), as indicated by fluorescein-labeled dsDNA breaks [terminal deoxynucleotidyl transferase–mediated deoxyuridine triphosphate nick end labeling (TUNEL), green and lower panel] versus total cell nuclei (Hoechst 33342 DNA stain, red) 7 days after treatment with dsRNAs.
(C) Left panel: Rescue of RNAi growth and viability phenotypes by a pan-caspase inhibitor (z-VAD-fmk). Shown are the ratios between combined treatments with dsRNA and either z-VAD-fmk in dimethyl sulfoxide (DMSO), or in DMSO alone (red line, normalized to 1), from results of averaged triplicate experiments. Right panel: Data are displayed as in the left panel, but they are from combined treatments with test dsRNA and either dsRNA against Nc caspase or gfp control.
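The normalized ratios plotted in (C) can be illustrated with a small helper. This is a sketch under assumptions: the function name is hypothetical, the replicate values are invented, and the caption does not state whether replicates were averaged before or after taking the ratio (here they are averaged first). A ratio above 1 means more signal with the inhibitor, or with Nc dsRNA, than with the matching control treatment.

```python
from statistics import mean


def rescue_ratio(with_treatment, with_control):
    """Ratio of mean readout with treatment to mean readout with control.

    A value of 1 corresponds to the normalized control line; values
    above 1 indicate rescue of the viability phenotype.
    """
    control_mean = mean(with_control)
    if control_mean == 0:
        raise ValueError("control readout is zero; ratio is undefined")
    return mean(with_treatment) / control_mean


# Invented triplicate readouts: test dsRNA plus z-VAD-fmk in DMSO,
# versus the same dsRNA plus DMSO alone.
ratio = rescue_ratio([200.0, 220.0, 180.0], [100.0, 110.0, 90.0])
```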
In comparisons made between complete proteomes (12), the percentage of predicted orthologs found for the genes with RNAi viability phenotypes was higher than the percentage of orthologs found in searches of the entire Drosophila proteome with those from yeast, worm, mosquito, mouse, and human (fig. S6 and table S5). Notably, 50 genes had homology to human disease genes (table S6), including 10 genes implicated in blood-cell leukemia (e.g., AML1) and genes described with anti-apoptotic functions (FOXOA1 and MLK). Thus, functional analysis in Drosophila cells uncovered common key regulators for animal cell survival and proliferative decisions. Interestingly, in contrast to the total results, the most severe RNAi phenotypes (z score > 5, fig. S8) identified significantly fewer yeast homologs (from 39 to 19.3%, respectively), a similar percentage of animal-specific homologs (27.6 and 29.8%), and an increased number of genes without high-scoring matches (from 33.3 to 50.9%) (12). This suggests that metazoans may have evolved specific mechanisms, such as the preservation of cell identity by a specific code of transcription factors, to maintain cell viability.
Functional analysis by RNAi reveals previously unknown and evolutionarily conserved gene functions, with the powerful ability to comprehensively and quantitatively determine the contribution of potentially every gene to a particular process. Quantitative cell-based analysis offers advantages by permitting the detection of gene functions associated with subtle or redundant phenotypes in organisms. This approach also holds the potential for statistical clustering across many different cellular phenotypes to elucidate complex gene functions as more data accumulate (23). The described genome-wide RNAi library is adaptable for screening for many different cellular pathways and processes, ultimately leading to a functional understanding of cellular systems that control development and disease.
References and Notes
1. S. A. Chervitz et al., Science 282, 2022 (1998).
2. T. R. Golub et al., Science 286, 531 (1999).
3. G. Giaever et al., Nature 418, 387 (2002).
4. R. S. Kamath et al., Nature 421, 231 (2003).
5. J. C. Clemens et al., Proc. Natl. Acad. Sci. U.S.A.
97, 6499 (2000).
6. M. Ramet, P. Manfruelli, A. Pearson, B. Mathey-Prevot,
R. A. Ezekowitz, Nature 416, 644 (2002).
7. M. P. Somma, B. Fasulo, G. Cenci, E. Cundari, M.
Gatti, Mol. Biol. Cell 13, 2448 (2002).
8. L. Lum et al., Science 299, 2039 (2003).
9. A. Kiger et al., J. Biol. 2, 27 (2003).
10. M. Guo, B. A. Hay, Curr. Opin. Cell Biol. 11, 745 (1999).
11. M. Hild et al., Genome Biol. 5, R3 (2003).
12. Materials and methods are available as supporting
material on Science Online. Complete protocols and data sets are also
provided on http://drsc.med.harvard.edu/viability
13. B. A. Hay, D. A. Wassarman, G. M. Rubin, Cell 83, 1253
(1995).
14. G. Echalier, A. Ohanessian, In Vitro 6, 162 (1970).
15. S. Yanagawa, J. S. Lee, A. Ishimoto, J. Biol. Chem. 273,
32353 (1998).
16. S. Misra et al., Genome Biol. 3, RESEARCH0083 (2002).
17. FlyBase Consortium, Nucleic Acids Res. 31, 172 (2003).
FlyBase, a database of the Drosophila genome, is available at http://www.flybase.org
18. N. J. Mulder et al., Nucleic Acids Res. 31, 315 (2003).
InterPro, a database of protein families, domains, and functional sites,
is available at
http://www.ebi.ac.uk/interpro
19. L. H. Frank, C. Rushlow, Development 122, 1343 (1996).
20. I. Muro, B. A. Hay, R. J. Clem, J. Biol. Chem. 277, 49644
(2002).
21. K. White, E. Tahaoglu, H. Steller, Science 271, 805 (1996).
22. M. Boutros et al., data not shown.
23. F. Piano et al., Curr. Biol. 12, 1959 (2002).
24. We thank T. Mitchison and the Institute of Chemistry
and Cell Biology for advice on cell-based screens and usage of equipment;
L. Hrdlicka and S. Hagar for excellent technical support; R. Steen for
use of equipment; L. Kockel, B. Gelbart, and D. Emmert for constructive
discussions; and G. Rubin, T. Ingolia, P. Leder, T. Mitchison, A. McMahon,
L. Perkins, R. Tanis, M. Vincent, and B. Ward for support at various stages
of this project. Supported by HHMI, as well as a gift from M. Crowinshield
and additional support from Harvard Medical School. Work in R.P. laboratory
was supported by the German Human Genome Project (DHGP). M.B. was supported
by an Emmy-Noether grant from the Deutsche Forschungsgemeinschaft. A.K.
was supported by The Jane Coffin Childs Memorial Fund for Medical Research.
Supporting Online Material:
http://drsc.med.harvard.edu/viability/
or:
http://www.sciencemag.org/cgi/content/full/303/5659/832/DC1
Materials and Methods, SOM Text, Figs. S1 to S8, Tables S1 to S8
Table S1:
http://drsc.med.harvard.edu/viability/1091266tabS1.pdf