We develop a flexible and computationally efficient approach for analyzing high-throughput chemical genetic screens. In such screens, a library of genetic mutants is phenotyped under a large number of stresses. Typically, interactions between genes and stresses are detected by grouping the mutants and stresses into categories and performing modified t-tests for each combination. This approach does not have a natural extension if mutants or stresses have quantitative or overlapping annotations (e.g., if conditions have doses or a mutant falls into more than one category simultaneously). We develop a matrix linear model (MLM) framework that models relationships between mutants and conditions in a simple yet flexible multivariate setting. It encodes both categorical and continuous relationships to enhance detection of associations. We develop a fast estimation algorithm that takes advantage of the structure of MLMs. We evaluate our method's performance in simulations and in an Escherichia coli chemical genetic screen, comparing it with an existing univariate approach based on modified t-tests. We show that MLMs perform slightly better than the univariate approach when mutants and conditions are classified into nonoverlapping categories, and substantially better when conditions can be ordered into dosage categories. The MLM approach is therefore an attractive alternative to current methods and provides a computationally scalable framework for larger and more complex chemical genetic screens. A Julia language implementation of MLMs and the code used for this paper are available at https://github.com/janewliang/GeneticScreen.jl and https://bitbucket.org/jwliang/mlm_gs_supplement, respectively.
Keywords: E. coli; chemical genetic screens; high-throughput data; linear models.
Copyright © 2019 by the Genetics Society of America.