Plant Cell. 2013 Aug;25(8):2848-63.
doi: 10.1105/tpc.113.112805. Epub 2013 Aug 16.

A Genomic-Scale Artificial microRNA Library as a Tool to Investigate the Functionally Redundant Gene Space in Arabidopsis

Free PMC article

Felix Hauser et al. Plant Cell.

Abstract

Traditional forward genetic screens are limited in the identification of homologous genes with overlapping functions. Here, we report the analyses and assembly of genome-wide protein family definitions that comprise the largest estimate for the potentially redundant gene space in Arabidopsis thaliana. On this basis, a computational design of genome-wide family-specific artificial microRNAs (amiRNAs) was performed using high-performance computing resources. The amiRNA designs are searchable online (http://phantomdb.ucsd.edu). A computationally derived library of 22,000 amiRNAs was synthesized in 10 sublibraries of 1505 to 4082 amiRNAs, each targeting defined functional protein classes. For example, 2964 amiRNAs target annotated DNA and RNA binding protein families and 1777 target transporter proteins, and another sublibrary targets proteins of unknown function. To evaluate the potential of an amiRNA-based screen, we tested 122 amiRNAs targeting transcription factor, protein kinase, and protein phosphatase families. Several amiRNA lines showed morphological phenotypes, either comparable to known phenotypes of single and double/triple mutants or caused by overexpression of microRNAs. Moreover, novel morphological and abscisic acid-insensitive seed germination mutants were identified for amiRNAs targeting zinc finger homeodomain transcription factors and mitogen-activated protein kinase kinase kinases, respectively. These resources provide an approach for genome-wide genetic screens of the functionally redundant gene space in Arabidopsis.

Figures

Figure 1.
Analysis of Publicly Available Family Definitions for the Proteome of Arabidopsis. (A) Venn diagram of protein-coding loci in the three largest family definitions (Phytozome, PIRSF, and PFAM). (B) The bar graph shows the relative distribution of family size in the Phytozome, PIRSF, and PFAM family definitions (100% is the total number of families in the individual family definition). The families are categorized in six groups according to their size. (C) Heat map of a comparative analysis of family definitions. Note that all of the 12 depicted gene family definitions contribute additional gene families that were included for the genome-wide amiRNA design. The color scale represents the relative similarity between the family definitions (see Supplemental Figure 1 and Supplemental Tables 1 and 2 online for further details). (D) Integrated view of 12 genome-wide Arabidopsis gene family definitions (red circles) with their associated gene families (green circles). Gray lines (edges) connect the 12 family definitions with their associated families. Gray lines also connect families that share members.
Figure 2.
Design of a Genome-Wide AmiRNA Collection Targeting Gene Families. (A) Simplified schematic illustration of the workflow used for the computational iterative design of family-targeting amiRNAs. Starting with the analysis of family definitions, the amiRNA design was performed using whole families and using subclusters to increase the number of targets per family. The resulting amiRNAs were assembled in a Web-accessible database. A selection of 22,000 amiRNAs was chosen for synthesis (see Results for details). (B) Schematic representation of the relation between target classes, amiRNAs that target specific genes, and the targeted genes. Shown are two target classes comprising two (left) or three (right) target genes. The target class is a term for a group of genes targeted by one or several amiRNAs. (C) Overview of the total number of genes in the relevant groups (x axis from left to right): all annotated loci in the genome of Arabidopsis, all protein coding loci, all loci in the family definitions analyzed (see Supplemental Table 1 online), all loci targeted by any designed amiRNA, and all loci targeted by the 10 PHANTOM sublibraries. (D) Size distribution of the functionally classified amiRNA sublibrary pools represented in the designed PHANTOM library. Sublibrary abbreviations: TFB, transcription factors and other RNA and DNA binding proteins; PKR, protein kinases, protein phosphatases, receptors, and their ligands; HEC, hydrolytic enzymes (enzyme classification [EC] class 3), excluding protein phosphatases; CSI, proteins that form, interact with, or stabilize protein complexes; TEC, metabolic and other enzymes catalyzing transfer reactions (EC class 2); PEC, catalytically active proteins, mainly enzymes; BNO, proteins binding small molecules; TRP, proteins that transport organic and inorganic molecules across membranes; DMF, proteins with diverse functional annotations not found in the other categories; UNC, genes for which the function is not known or cannot be inferred.
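The target-class notion defined in Figure 2B (a group of genes targeted by one or several amiRNAs) can be illustrated with a minimal Python sketch. This is not the authors' design pipeline; the function name `group_into_target_classes` and all amiRNA and gene identifiers are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def group_into_target_classes(amirna_targets):
    """Group amiRNAs by the exact set of genes they target.

    amirna_targets maps a (hypothetical) amiRNA identifier to the list of
    genes it is predicted to target. amiRNAs sharing the same gene set end
    up in the same target class.
    """
    classes = defaultdict(list)
    for amirna, genes in amirna_targets.items():
        # frozenset ignores gene order and is hashable, so it can be a key.
        classes[frozenset(genes)].append(amirna)
    return dict(classes)


# Hypothetical example mirroring Figure 2B: one class with two target
# genes and one class with three target genes.
example = {
    "amiRNA_1": ["GENE_A", "GENE_B"],
    "amiRNA_2": ["GENE_B", "GENE_A"],
    "amiRNA_3": ["GENE_C", "GENE_D", "GENE_E"],
}
target_classes = group_into_target_classes(example)
for genes, amirnas in target_classes.items():
    print(sorted(genes), "<-", sorted(amirnas))
```

Keying on the gene set rather than on the amiRNA makes it easy to count how many distinct gene combinations a collection covers and how many amiRNAs back each combination.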
Figure 3.
Visualization of Relationships between AmiRNAs (Green) and Targeted Genes (Red). (A) The network shown visualizes the largest connected unit in the PHANTOM library. The library assembly procedure maximizes the number of unique groups of targeted genes. Thus, the connectivity and complexity of the network are high. The target genes are part of the Kinomer TKL (Tyr kinase-like kinases) family, by far the largest group of kinases in land plants (Martin et al., 2009). AmiRNAs (green circles, nodes) are connected by edges (gray lines) to their target genes (red circles, nodes). Each green circle represents one amiRNA sequence. (B) The network shown visualizes the connected unit in an alternative assembly procedure, which was tested for an initial PHANTOM library design but not pursued further (see Results). This alternative assembly procedure maximizes the number of amiRNAs with a minimal set of groups of targeted genes. This reduces the connectivity and complexity of the network. The target genes are part of the AP2 transcription factor family (Kim et al., 2006). AmiRNAs (green circles, nodes) are connected by edges (gray lines) to their target genes (red circles, nodes). Here, each green circle represents more than one amiRNA sequence. (C) Example of a typical target class target gene network in the PHANTOM amiRNA library representing members of the NAC transcription factor family. Orange rectangles represent target genes, while green rectangles represent amiRNAs labeled by an identifier. Note that most gene pairs are targeted by two or more amiRNAs, and each amiRNA targets a separate group of genes, thus enhancing robustness and gene combinations for screening.
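As a rough illustration of what a "connected unit" means in such an amiRNA-gene network, the sketch below builds a bipartite graph from (amiRNA, gene) edges and finds its connected components with a breadth-first search. It is a generic graph routine, not the procedure used to assemble the PHANTOM library, and the identifiers `ami1`, `G1`, and so on are invented for the example.

```python
from collections import defaultdict, deque


def connected_units(edges):
    """Return the connected components of a bipartite amiRNA-gene graph.

    edges is an iterable of (amirna_id, gene_id) pairs. Nodes are tagged
    with their type so an amiRNA and a gene can never collide by name.
    """
    adjacency = defaultdict(set)
    for amirna, gene in edges:
        a, g = ("amiRNA", amirna), ("gene", gene)
        adjacency[a].add(g)
        adjacency[g].add(a)

    seen, units = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue, unit = deque([start]), set()
        while queue:
            node = queue.popleft()
            unit.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        units.append(unit)
    return units


# Hypothetical edges: ami1 and ami2 share gene G2, so they form one unit;
# ami3 targets only G4 and forms a second, smaller unit.
edges = [("ami1", "G1"), ("ami1", "G2"), ("ami2", "G2"),
         ("ami2", "G3"), ("ami3", "G4")]
units = connected_units(edges)
largest = max(units, key=len)
print(len(units), "units; largest has", len(largest), "nodes")
```

amiRNAs that share even a single target gene fall into the same unit, which is why an assembly that favors many distinct, overlapping gene groups yields larger and denser units than one that reuses a small set of groups.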
Figure 4.
Phenotypes of AmiRNA Lines Obtained with the AmiRNAs of the TPK Library, Which Are Related to Known Morphological Phenotypes. (A) Rosette, leaves, and inflorescence of a representative untransformed wild-type control (Col-0 pRAB18:GFP) plant. (B) Rosette, leaves, and inflorescence of a representative line transformed with amiRNA-ARF (predicted targets: ARF1, ARF3, ARF4, ARF5, ARF6, and ARF8; see Supplemental Table 5 online) with similar phenotype to miRNA167 overexpression (Wu et al., 2006) or arf6 arf8 double mutants (Nagpal et al., 2005). (C) Rosette and whole plant of a representative line transformed with amiRNA-SBP (predicted targets: SPL2, SPL3, SPL4, SPL6, SPL9, SPL10, SPL11, SPL13B, SPL13A, and SPL15; see Supplemental Table 5 online) with similar phenotype to miRNA156b overexpression plants (Schwab et al., 2005). (D) Rosette, leaves, and inflorescence of a representative line transformed with amiRNA-TCP (TB1, CYC, and PCF family; predicted targets: TCP2, TCP4, TCP10, TCP13, and TCP24; see Supplemental Table 5 online) with similar phenotype to miRNA319a overexpression plants (Palatnik et al., 2003). (E) Rosette and inflorescence of a representative line transformed with amiRNA-MADS (MCM1, AGAMOUS, DEFICIENS, SRF family; predicted targets: AG, AGL3, AGL6, AGL7, AGL10, AGL13, AGL15, AGL25, and AGL72; see Supplemental Table 5 online) with similar phenotype to agamous alleles (Bowman et al., 1991). (F) Rosette and inflorescence of a representative line transformed with amiRNA-C2C2-CO-like [C(2)-C(2) zinc finger constans like; predicted targets: CO, COL1, COL2, COL4, and COL5; see Supplemental Table 5 online] with similar phenotypes to CONSTANS alleles (Koornneef et al., 1991). (G) Whole seedling of three independent transformants obtained with amiRNA-HB-1 (predicted targets: AT1G05230, ATML1, HDG11, HDG3, HDG9, and PDF2; see Supplemental Table 5 online). Comparable to pdf2 atml1 double mutants (Abe et al., 2003), these plants also do not develop further than shown in the image (see Supplemental Tables 5 and 6 online for details). (H) Rosette, leaves, inflorescence, and aerial rosette of a representative line transformed with amiRNA-HB-2 (predicted targets: BEL1, BLH2, BLH3, BLH4, BLH5, PNF, ATH1, BLH11, and PETB; see Supplemental Table 5 online). While the leaf borders share similarity with a sawtooth-1 sawtooth-2 double mutant (Kumar et al., 2007), the internode patterning of the siliques and the aerial rosette resemble ath1-1 pnf pny triple mutants (Rutjens et al., 2009), thereby possibly showing additive effects of the targets. (I) Rosette and whole plant of a representative line transformed with amiRNA-bZIP (basic region/Leu zipper motif transcription factor; predicted targets: PAN, TGA1, TGA1, TGA2, TGA4, TGA5, and TGA6; see Supplemental Table 5 online). The plant leaves seem to be paler, and the distance between the last cauline leaf and the flower is shortened compared with wild-type plants (white arrow marks inflorescence shown in the top left of the panel).
Figure 5.
Phenotypes of AmiRNA Lines Obtained with the AmiRNAs of the TPK Library, Which Show Gene Family–Linked Phenotypes Not Previously Described (See Supplemental Tables 5 and 6 Online for Details). (A) Rosette, leaves, whole plant, and inflorescence of a representative line transformed with amiRNA-zfHD (predicted targets: ATHB22, ATHB23, ATHB25, ATHB26, ATHB27, ATHB29, ATHB30, ATHB33, and ATHB34; see Supplemental Table 5 online). The larger number of stems, larger number of flowers, and short siliques are typical for this amiRNA. (B) Whole plant of a representative line transformed with amiRNA-M3K (predicted targets: AtMAP3Kδ1, AtMAP3Kδ3, AtMAP3Kδ5, AtMAP3Kθ1, AtMAP3Kθ2, AT1G73660, and AT1G18160; see Supplemental Table 5 online). In general, those plants were smaller than wild-type plants of the same age. (C) Images of seeds germinating on plates containing 2 μM ABA at the indicated time points. Shown are representative lines expressing an amiRNA targeting a set of seven M3Ks (amiRNA-M3K; top row). The wild-type control (an amiRNA targeting human myosin 2; amiRNA-HsMYO, no target in Arabidopsis; middle row) and an amiRNA line targeting ABI5 (amiRNA-ABI5), used as a reference for insensitivity, are shown in the other two rows. (D) Average ratio of amiRNA-zfHD target gene expression levels in the wild type versus amiRNA-zfHD expressing plants as determined by quantitative real-time PCR (n = 4; the bars represent average ± sd). (E) Average ratio of amiRNA-M3K target gene expression levels of amiRNA-M3K–expressing plants compared with the wild type as determined by quantitative real-time PCR (n = 4; the bars represent average ± sd). (F) Time course of hypocotyl emergence in the presence of ABA for wild-type control (two lines; amiRNA targeting human myosin 2, no target in Arabidopsis; amiRNA-HsMYO), one amiRNA targeting ABI5 (four lines amiRNA-ABI5) as a reference for insensitivity, and amiRNA lines targeting seven mitogen-activated kinases (five lines). The bars represent average ± se for each amiRNA in percentage relative to the total number of seeds. (G) Time course of germination in the presence of ABA for amiRNA lines targeting seven mitogen-activated kinases (amiRNA-M3K; five lines), one amiRNA targeting ABI5 (amiRNA-ABI5; four lines) as a reference for insensitivity, and wild-type control (amiRNA targeting human myosin 2 [amiRNA-HsMYO], no target in Arabidopsis, two lines). The bars represent average ± se for each amiRNA in percentage relative to the total number of seeds.
