Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and access to the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse transcriptionally barcoded gene deletion mutants in 11 different environmental conditions and determined their expression state by sequencing 38,285 individual cells. We benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning and constructed a global gene regulatory network comprising 12,228 interactions.
Keywords: S. cerevisiae; computational biology; gene regulatory networks; single cell RNA sequencing; systems biology; transcription factors.
Organisms switch their genes on and off to adapt to changing environments. This takes place thanks to complex networks of regulators that control which genes are actively ‘read’ by the cell to create the RNA molecules that are needed at the time. Piecing together these networks is key to fully understanding the inner workings of living organisms, and how to potentially modify or artificially create them.
Single-cell RNA sequencing is a powerful new tool that can measure which genes are turned on (or ‘expressed’) in an individual cell. Datasets with millions of gene expression profiles for individual cells now exist for organisms such as mice or humans. Yet, it is difficult to use these data to reconstruct networks of regulators; this is partly because scientists are not sure if the computational methods normally used to build these networks also work for single-cell RNA sequencing data. One way to check if this is the case is to use the methods on single-cell datasets from organisms where the networks of regulators are already known, and check whether the computational tools help to reach the same conclusion. Unfortunately, the regulatory networks in the organisms for which scientists have a lot of single-cell RNA sequencing data are still poorly known. There are living beings in which the networks are well characterised, such as yeast, but it has been difficult to do single-cell sequencing in them at the scale seen in other organisms.
Jackson, Castro et al. first adapted a system for single-cell sequencing so that it would work in yeast. This generated a gene expression dataset of over 40,000 yeast cells. They then used a computational method (called the Inferelator) on these data to construct networks of regulators, and the results showed that the method performed well. This allowed Jackson, Castro et al. to start mapping how different networks connect, for example those that control the response to the environment and cell division. This is one of the benefits of single-cell RNA methods: cell division, for example, is not a process that can be examined at the level of a population, since the cells may all be at different life stages. In the future, the dataset will also be useful to scientists to benchmark a variety of single-cell computational tools.
© 2020, Jackson et al.