A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics

Front Genet. 2017 Nov 1;8:168. doi: 10.3389/fgene.2017.00168. eCollection 2017.


Cell-based assays are an attractive option for measuring gene expression responses to chemical exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing introduces variability arising from differences in transcript length and from amplification. Targeted probe-sequencing technologies such as TempO-Seq, whose transcriptomic coverage can range from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose-response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe and make available a pipeline for TempO-Seq expression data that aligns reads, cleans and normalizes raw count data, identifies differentially expressed genes, and calculates transcriptomic concentration-response points of departure. The methods are extensible to other forms of concentration-response gene-expression data, and we discuss their utility for assessing variation in susceptibility and the diseased cellular state.

Keywords: bioinformatics & computational biology; bioinformatics-pipeline; cardiomyocytes; dose–response modeling; expression profiling; expression-based dose–response modeling; iPSCs; toxicogenomics.
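
To make two of the pipeline steps named in the abstract concrete (normalizing raw counts and deriving a concentration-response point of departure), the following is a minimal sketch and not the published pipeline. The function names, the counts-per-million scaling, and the control-band rule with interpolation on log10 concentration are all assumptions chosen for brevity; the pipeline itself may rely on different normalization and on model-based dose-response fitting.

```python
import math
from statistics import mean, stdev


def counts_per_million(counts):
    """Scale one sample's raw probe counts to counts per million (CPM).

    `counts` maps a gene/probe name to its raw read count (hypothetical layout).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def log2_cpm(counts, pseudocount=1.0):
    """Log2-transform CPM values; the pseudocount avoids log2(0)."""
    cpm = counts_per_million(counts)
    return {gene: math.log2(v + pseudocount) for gene, v in cpm.items()}


def point_of_departure(concentrations, responses, control_values, n_sd=1.0):
    """Estimate the lowest concentration at which a gene leaves the control band.

    Assumed rule (illustrative only): the response "departs" once its absolute
    deviation from the vehicle-control mean reaches n_sd control standard
    deviations. The crossing point is interpolated linearly on log10
    concentration between the two bracketing tested concentrations.

    concentrations -- tested concentrations, ascending, all > 0
    responses      -- mean normalized expression at each concentration
    control_values -- replicate vehicle-control values (at least two)
    Returns None if the response never reaches the threshold.
    """
    baseline = mean(control_values)
    threshold = n_sd * stdev(control_values)
    prev_conc, prev_dev = None, 0.0
    for conc, resp in zip(concentrations, responses):
        dev = abs(resp - baseline)
        if dev >= threshold:
            if prev_conc is None:
                # Already outside the band at the lowest tested concentration.
                return conc
            frac = (threshold - prev_dev) / (dev - prev_dev)
            log_pod = math.log10(prev_conc) + frac * (
                math.log10(conc) - math.log10(prev_conc)
            )
            return 10 ** log_pod
        prev_conc, prev_dev = conc, dev
    return None


if __name__ == "__main__":
    # Toy values, not data from the study.
    sample = {"GENE_A": 10, "GENE_B": 30, "GENE_C": 60}
    print(counts_per_million(sample))
    print(point_of_departure([0.1, 1, 10, 100], [5.0, 5.5, 7.0, 9.0], [4.0, 5.0, 6.0]))
```

A threshold rule of this kind is only a stand-in for curve fitting: it needs no optimizer and works with few samples per concentration, but it is sensitive to noise in the control replicates.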