Plant Cell. 2021 May 31;33(4):1058-1082.
doi: 10.1093/plcell/koab042.

Co-expression networks in Chlamydomonas reveal significant rhythmicity in batch cultures and empower gene function discovery

Free PMC article

Patrice A Salomé et al. Plant Cell.

Abstract

The unicellular green alga Chlamydomonas reinhardtii is a choice reference system for the study of photosynthesis and chloroplast metabolism, cilium assembly and function, lipid and starch metabolism, and metal homeostasis. Despite decades of research, the functions of thousands of genes remain largely unknown, and new approaches are needed to categorically assign genes to cellular pathways. Growing collections of transcriptome and proteome data now allow a systematic approach based on integrative co-expression analysis. We used a dataset comprising 518 deep transcriptome samples derived from 58 independent experiments to identify potential co-expression relationships between genes. We visualized co-expression potential with the R package corrplot, to easily assess co-expression and anti-correlation between genes. We extracted several hundred high-confidence genes at the intersection of multiple curated lists involved in cilia, cell division, and photosynthesis, illustrating the power of our method. Surprisingly, Chlamydomonas experiments retained a significant rhythmic component across the transcriptome, suggesting an underappreciated variable during sample collection, even in samples collected in constant light. Our results therefore document substantial residual synchronization in batch cultures, contrary to assumptions of asynchrony. We provide step-by-step protocols for the analysis of co-expression across transcriptome data sets from Chlamydomonas and other species to help foster gene function discovery.
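As a minimal sketch of the co-expression idea, assume that co-expression between two genes is scored as the Pearson correlation coefficient (PCC) of their normalized expression across samples, as the figure legends suggest. The gene names and expression values below are invented for illustration and are not taken from the dataset; the authors' own workflow relies on R and corrplot, not on this Python code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(expression):
    """Return a nested dict of PCCs for every pair of genes."""
    genes = list(expression)
    return {
        g1: {g2: pearson(expression[g1], expression[g2]) for g2 in genes}
        for g1 in genes
    }


# Hypothetical expression values: one list per gene, one entry per sample.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA: co-expressed
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises: anti-correlated
}
pcc = correlation_matrix(expression)
```

In this toy input, geneA and geneB would score as strongly co-expressed and geneA and geneC as anti-correlated, which is the kind of contrast a correlation matrix plot makes visible.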


Figures

Figure 1
Samples from the same experiment are strongly correlated. A), Correlation matrices between all samples using expression estimates for all 17,741 nuclear genes as FPKM. B), As in panel A, but after all normalization steps. In panels A and B, samples belonging to the same experiment are in consecutive order, and roughly in chronological order. C), Distribution of PCCs between (inter-expt, gray) and within (intra-expt, green) experiments. PCCs for all comparisons between experiments are shown as violin plots and box plots, alongside mean PCCs from all samples within each experiment, samples collected in the context of nitrogen deprivation (blue), PCCs for all metal-related samples (light purple) and specific metals (darker shades of purple), samples collected over a diurnal cycle (light orange), and PCC between subsets of samples (darker shades of orange). Values along the diagonal of the matrix (equal to 1) were discarded prior to plotting. D), Correlation matrix for samples from metal-related experiments, all from the Merchant laboratory, and in which either one micronutrient has been omitted from the growth medium (for deficiency conditions: copper Cu, iron Fe, manganese Mn, and zinc Zn) or a toxic metal was added to observe the effect on homeostasis (cadmium Cd and nickel Ni). E), Correlation matrix of samples collected over a diurnal cycle. The light- and dark-part of each sampling day is indicated on the left and bottom sides of the matrix as white and black bars, respectively. Four time courses are compared here (Panchy et al., 2014; Zones et al., 2015; Strenkert et al., 2019).
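The comparison in panel C can be sketched as follows, assuming each sample is a vector of expression estimates labeled with the experiment it came from. Iterating over distinct sample pairs leaves out the self-comparisons along the diagonal (always equal to 1). The sample names, experiment labels, and values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_sample_pccs(samples):
    """Split pairwise sample PCCs into intra- and inter-experiment lists.

    `samples` maps a sample name to (experiment label, expression vector).
    Only distinct pairs are visited, so diagonal values are never included.
    """
    intra, inter = [], []
    for (_, (exp1, v1)), (_, (exp2, v2)) in combinations(samples.items(), 2):
        r = pearson(v1, v2)
        if exp1 == exp2:
            intra.append(r)
        else:
            inter.append(r)
    return intra, inter


# Hypothetical samples: two experiments with two samples each.
samples = {
    "s1": ("exptA", [1.0, 2.0, 3.0, 4.0]),
    "s2": ("exptA", [1.1, 2.1, 2.9, 4.2]),
    "s3": ("exptB", [4.0, 1.0, 3.0, 2.0]),
    "s4": ("exptB", [3.9, 1.2, 3.1, 1.8]),
}
intra, inter = split_sample_pccs(samples)
```

With these invented values the two intra-experiment PCCs are high and the four inter-experiment PCCs are lower, mirroring the pattern the legend describes.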
Figure 2
Correlations and anti-correlations between organellar energy-producing systems. A), Correlation matrix of nucleus-encoded components of mitochondrial respiratory complexes, in the order defined by Zones et al. (2015). An asterisk after the name of a complex signifies that its dedicated assembly factors (one to two genes outside of complex 4) are shown last, after the complex components. B), Correlation matrix of chlorophyll and heme biosynthesis genes. Genes have been ordered according to Zones et al. (2015). Pairs of homologous genes are indicated above the correlation matrix. C), Co-expression matrix of photosystem genes (in green) and tetrapyrrole biosynthetic genes (in blue). D), Comparison of co-expression profiles of chloroplast- and mitochondrion-localized energy production systems. The respiratory complex matrix is redrawn from Supplemental Figure S9. E), Distribution of PCCs between groups of genes. The gray distribution is the genome-wide distribution of all PCCs between all gene pairs. photo., photosynthesis; tetra., tetrapyrroles; resp., respiration.
Figure 3
Confirmation of high-confidence cilium proteins based on co-expression of their encoding genes. A), Correlation matrix of structural constituents of the Chlamydomonas cilium, in the order defined by Zones et al. (2015). DRC, dynein regulatory complex; BBS, Bardet–Biedl syndrome protein complex; BUG, basal body upregulated after deflagellation; POC, proteome of centriole; IFT, intra-flagellar transport. B), Correlation matrix between genes belonging to CiliaCut (green) or encoding components identified in the cilium proteome (light purple; Pazour et al., 2005). The genes within each subset were subjected to hierarchical clustering (FPC method in corrplot). C), Venn diagram of the overlap between genes encoding putative components of the cilium proteome, CiliaCut, and the cilia and basal body. Note that the gene lists do not reflect co-expression here. D), Venn diagram of the overlap between genes encoding putative components of the cilium proteome, CiliaCut, and genes belonging to cilia-related co-expression modules (listed in Supplemental Table S3). E), Venn diagram of the overlap between genes encoding putative components of the cilia and basal body and genes belonging to cilia-related co-expression modules.
Figure 4
Co-expression between RPGs reflects the final location of the corresponding ribosomal proteins. A), Correlation matrix between RPGs (Supplemental Data Set S1) and their translation regulators, sorted by the subcellular localization of their encoded proteins. For each set of RPGs and their regulators, we followed the same gene order defined by Zones et al. (2015). B), Correlation matrix restricted to RPGs. Each set of RPGs was subjected to hierarchical clustering (FPC method in corrplot) to single out non-co-expressed genes. C), Distribution of PCCs between RPG gene pairs encoding large or small ribosome subunits. The gray distribution indicates the PCC distribution of all gene pairs for the Chlamydomonas genome. D), Distribution of PCCs for gene pairs belonging to distinct RPG groups. E), Correlation matrix for 357 RPGs (Supplemental Data Set S5) using the fully normalized dataset derived from Arabidopsis microarray experiments (Supplemental Data Set S6). “Nuclear” and “unclear” denote RPGs whose encoded proteins are predicted to localize to the nucleus or lack a clear localization, respectively.
Figure 5
Correlations between Chlamydomonas histone genes. A), Correlation matrix among Chlamydomonas histone genes, ordered according to their genomic coordinates, using RNA-seq data derived from poly(A)-selected samples. B), Same as (A), using RNA-seq data derived from ribodepleted samples. Histone genes that are not regulated by the cell cycle are indicated as “non-replication histones.” H1, histone H1 genes; HVs, histone variants. C), Distribution of PCCs for classes of histone genes shown in (A) and (B). Histone variants (HVs) are shown in light blue, replication-associated histones in purple, and non-replication histones in light green. D), Global clustering of histone genes in Chlamydomonas. All histone genes occur as divergent pairs and are oftentimes grouped as one representative of each major histone type (H2A, H2B, H3, and H4). The number to the left gives the number of instances of the given arrangement in the Chlamydomonas genome. E), Comparison of histone gene clustering in selected photosynthetic organisms. O. lucimarinus, Ostreococcus lucimarinus; D. salina, Dunaliella salina; V. carteri, Volvox carteri; C. zofingiensis, Chromochloris zofingiensis; M. polymorpha, Marchantia polymorpha; P. patens, Physcomitrium patens. The asterisk for histone H2B genes in D. salina indicates that they are absent from the current annotation, but were identified by TBLASTN against the D. salina genome with the Chlamydomonas histone H2B protein sequence as query.
Figure 6
Core cell division genes are coordinately and highly co-expressed. A), Correlation matrix of non-redundant cell division modules and correlation matrix of genes whose loss of function leads to cell division defects (Tulin and Cross, 2014; Breker et al., 2018). Genes within each set were ordered according to hierarchical clustering using the FPC method in corrplot. B–D), Co-expressed cohorts, shown as nested Venn diagrams, associated with genes from the cell division modules (B), the genetics list (C), or genes involved in DNA replication and chromosome segregation (manual list) (D) from networks N1–N3. E), Overlap between original gene lists related to cell division (modules, genetics, and manual lists). F), Correlation matrix of non-redundant cilia modules (modules) and genes belonging to CiliaCut only (CiliaCut), the cilium proteome and shared genes between CiliaCut and the cilium proteome (overlap). The color bars on the right refer to the color scheme used for co-expression cohorts in G–J. G–I), Co-expressed cohorts, shown as nested Venn diagrams, associated with genes from CiliaCut (G), the overlap between CiliaCut and the cilium proteome (H), and the cilium proteome (I) from networks N1–N3. J), Overlap between N1 cohorts associated with each initial gene list (CiliaCut, overlap, and cilium proteome). K), Correlation matrix of non-redundant photosynthesis modules, photosynthesis-related genes, and tetrapyrrole biosynthesis-related genes. L–N), Co-expressed cohorts, shown as nested Venn diagrams, associated with genes from the photosynthesis modules (L), photosynthesis-related genes (M), and tetrapyrrole biosynthesis-related genes (N) from networks N1–N3. O), Overlap between initial gene lists. P), Overlap between N1 cohorts associated with photosynthesis and tetrapyrrole biosynthesis. In panels C, D, M, and N, the asterisk indicates that the gene list was restricted to highly co-expressed genes, based on FPC clustering of the data.
Figure 7
Co-expression modules routinely comprise genes with similar diurnal phases. A), Schematic of the Chlamydomonas diurnal cycle in cell division events. B), Phase distribution of 10,294 high-confidence diurnally rhythmic genes, shown as a circular plot covering the full 24 h of a complete diurnal cycle. Gray shade indicates night. C), Co-expression modules with a high percentage of rhythmic genes exhibit a uniform diurnal phase. The light purple shade indicates the distribution of rhythmic modules. D–K), Example of phase distribution for co-expression modules and associated N1 co-expression cohorts.
Figure 8
Genes cluster based on their diurnal phase. A), Correlation matrix of the 17,741 Chlamydomonas nuclear genes, ordered based on clustering by the AOE method built into corrplot, using the fully normalized dataset RNAseq4, RNAseq4LD (consisting of RNA samples collected from cells grown under light-dark cycles), and RNAseq4LL (with all other RNA-seq samples) as input. B), Distribution of pairwise PCCs for all gene pairs using RNAseq4, RNAseq4LD, and RNAseq4LL as input. C), Scatterplot of diurnal phases from 10,294 high-confidence diurnally rhythmic genes, as a function of their order from AOE clustering, using RNAseq4, RNAseq4LD, and RNAseq4LL as input. We saved gene order following AOE clustering (from 1 to 17,741) and plotted the diurnal phase of the subset of 10,294 rhythmic genes (along the y-axis). D), Scatterplot of diurnal phases from 10,294 high-confidence diurnally rhythmic genes, ordered based on the AOE clustering method on RNAseq4 (y-axis) and RNAseq4LD or RNAseq4LL (x-axis).
Figure 9
Chlamydomonas cultures grown in constant light retain substantial rhythmicity. A), Heatmap representation of the molecular timetable approach, applied to two diurnal datasets: Strenkert et al. (2019) and Zones et al. (2015). B), Heatmap representation of the molecular timetable approach, applied to all remaining RNA-seq samples. In panels (A) and (B), each sample is represented as the mean expression of 20 phase marker genes (per h). In (A), diurnal samples are ordered from top to bottom. For (B), samples were subjected to hierarchical clustering while generating the heatmap in R. as: heatmap from an asynchronous sample, corresponding to the average expression of all rhythmic genes for each time point. C), Scatterplot of minimum and maximum normalized expression across all RNA-seq samples. Diurnal time courses are indicated by a gray shade. as: expected position of minima and maxima for a completely asynchronous sample. The samples are ordered by experiments, therefore consecutive data points belong to the same experiment. D), Peak and trough times largely occur 12 h apart. Scatterplot of all peak expression times (x-axis) and trough times (y-axis). E), Distribution of peak times across all RNA-seq samples.
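A simplified reading of the molecular timetable approach can be sketched as below. It assumes that marker genes are grouped by the hour at which they are known to peak, that a sample is summarized by the mean normalized expression of the markers in each hour bin, and that the peak and trough times are the bins with the highest and lowest means. The marker names, the four hour bins, and the two markers per bin are hypothetical simplifications (the legend uses 20 markers per h), and this is not the authors' R implementation.

```python
def timetable_profile(sample, markers_by_hour):
    """Mean normalized expression of the marker genes assigned to each hour."""
    profile = {}
    for hour, genes in markers_by_hour.items():
        values = [sample[g] for g in genes if g in sample]
        if values:  # skip hour bins with no measured marker
            profile[hour] = sum(values) / len(values)
    return profile


def peak_and_trough(profile):
    """Hours with the highest and lowest mean marker expression."""
    peak = max(profile, key=profile.get)
    trough = min(profile, key=profile.get)
    return peak, trough


# Hypothetical phase markers, keyed by the hour at which they peak.
markers_by_hour = {
    0: ["m0a", "m0b"],
    6: ["m6a", "m6b"],
    12: ["m12a", "m12b"],
    18: ["m18a", "m18b"],
}

# Hypothetical mean-normalized expression values for one sample.
sample = {
    "m0a": -1.0, "m0b": -0.8,
    "m6a": 0.1, "m6b": -0.1,
    "m12a": 0.9, "m12b": 1.1,
    "m18a": 0.0, "m18b": 0.2,
}

profile = timetable_profile(sample, markers_by_hour)
peak, trough = peak_and_trough(profile)
```

For a fully asynchronous sample, the profile would be expected to be roughly flat across hour bins, so a clear peak and trough about 12 h apart is the signature of residual synchrony in this sketch.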
Figure 10
Application of the molecular timetable method to independent RNA-seq experiments across algae. A), Reanalysis of a transcriptome dataset included in our initial RNA-seq data (Urzica et al., 2012b). We subjected FPKM values to log2 normalization, followed by normalization to the mean (obtained during the normalization steps that yielded RNAseq4). We then used the molecular timetable method to determine the rhythmic pattern of the samples (Chlamydomonas CC-4532 strain grown in Tris Acetate Phosphate (TAP) or Tris Phosphate (CO2) medium with 0.25, 1, or 20 µM FeEDTA). B), Molecular timetable method applied to V. carteri samples collected in duplicates from somatic or gonidial cells (Matt and Umen, 2018). C), Molecular timetable method applied to C. zofingiensis samples collected over 12 h after addition and removal of glucose (Roth et al., 2019). For (A), we used 960 highly rhythmic genes to draw the heatmap. For (B) and (C), we included all rhythmic genes with orthologs in V. carteri (B) or C. zofingiensis (C), after log2 normalization and normalization with the Chlamydomonas-derived gene means.
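The two normalization steps named in this legend might look like the following sketch. It assumes that "normalization to the mean" means subtracting a per-gene mean of log2 values obtained from a reference dataset, and that a pseudocount of 1 is added before taking log2 to handle zero FPKM values; both are assumptions, since the legend does not spell out either detail. Gene names and values are hypothetical.

```python
import math


def normalize_sample(fpkm, gene_means, pseudocount=1.0):
    """log2-transform FPKM values, then subtract reference gene means.

    Genes without a reference mean are skipped. The pseudocount is an
    assumption of this sketch, not a documented step.
    """
    normalized = {}
    for gene, value in fpkm.items():
        if gene in gene_means:
            normalized[gene] = math.log2(value + pseudocount) - gene_means[gene]
    return normalized


# Hypothetical FPKM values for one sample.
fpkm = {"geneA": 7.0, "geneB": 0.0, "geneC": 3.0}

# Hypothetical per-gene means (log2 scale) from a reference dataset;
# geneC has no reference mean and is therefore left out.
gene_means = {"geneA": 2.0, "geneB": 1.0}

normalized = normalize_sample(fpkm, gene_means)
```

Reusing gene means from a reference dataset, rather than recomputing them per experiment, is what would let samples from a small independent experiment be placed on the same scale as the reference.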
