Arch Toxicol. 2015 Sep;89(9):1599-618.
doi: 10.1007/s00204-015-1573-y. Epub 2015 Aug 14.

A Transcriptome-Based Classifier to Identify Developmental Toxicants by Stem Cell Testing: Design, Validation and Optimization for Histone Deacetylase Inhibitors

Free PMC article

Eugen Rempel et al. Arch Toxicol.
Abstract

Test systems to identify developmental toxicants are urgently needed. A combination of human stem cell technology and transcriptome analysis was used to provide a proof of concept that toxicants with a related mode of action can be identified and grouped for read-across. We chose a test system of developmental toxicity, related to the generation of neuroectoderm from pluripotent stem cells (UKN1), and exposed cells for 6 days to the histone deacetylase inhibitors (HDACi) valproic acid, trichostatin A, vorinostat, belinostat, panobinostat and entinostat. To provide insight into their toxic action, we identified HDACi consensus genes, assigned them to superordinate biological processes and mapped them to a human transcription factor network constructed from hundreds of transcriptome data sets. We also tested a heterogeneous group of 'mercurials' (methylmercury, thimerosal, mercury(II)chloride, mercury(II)bromide, 4-chloromercuribenzoic acid, phenylmercuric acid). Microarray data were compared at the highest non-cytotoxic concentration for all 12 toxicants. A support vector machine (SVM)-based classifier predicted all HDACi correctly. For validation, the classifier was applied to legacy data sets of HDACi, and for each exposure situation, the SVM predictions correlated with the developmental toxicity. Finally, optimization of the classifier based on 100 probe sets showed that eight genes (F2RL2, TFAP2B, EDNRA, FOXD3, SIX3, MT1E, ETS1 and LHX2) are sufficient to separate HDACi from mercurials. Our data demonstrate how human stem cells and transcriptome analysis can be combined for mechanistic grouping and prediction of toxicants. Extension of this concept to mechanisms beyond HDACi would allow prediction of human developmental toxicity hazard of unknown compounds with the UKN1 test system.

Figures

Fig. 1
Data structure of transcriptome changes triggered by histone deacetylase inhibitors (HDACi) and mercurials in human stem cells differentiating to neuroectoderm. Stem cells were differentiated towards neuroectodermal progenitor cells within 6 days of differentiation (DoD6) as indicated on top. a The highest non-cytotoxic concentration [corresponding to EC10(cytotoxicity)] of all test compounds was determined in a viability assay. This ‘benchmark concentration’ (BMC) was used for obtaining transcriptome data of HDACi and mercurials in this study. The BMC was calculated based on concentration–response curves of three independent experiments. b EC50 data for inhibition of HDAC isoforms 1, 2, 4, 6 were retrieved from the literature (Khan et al. 2008). They are indicated by a black line, and the respective BMC in our study is indicated as a red dot. c The data structure of all transcriptome data sets was dimensionality-reduced and presented in the form of a 2D principal component analysis (PCA) diagram. Data show a typical batch effect (offset of controls) which segregated with measurements from different sets of biological samples. d The ComBat batch correction algorithm perfectly aligned the controls and led to compound-wise clustering on the PCA diagram. e Alternatively, respective control values were subtracted from treated samples. This simple manipulation also led to a satisfying batch correction and clustering of data points according to toxicants. Each point represents one experiment (=data from one microarray), and the colour coding (labelled in e for data in c–e) indicates the compound used in the experiment; panobino, panobinostat (colour figure online)
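The control subtraction described for panel e of Fig. 1 can be illustrated with a minimal sketch. It assumes log-scale expression values and that "respective control values" means the per-gene mean of the controls measured in the same batch; the caption does not specify either point. The record keys (`batch`, `compound`, `expr`) and the toy values are hypothetical.

```python
from statistics import mean

def subtract_controls(samples):
    """Subtract the per-batch mean control profile from every treated sample.

    `samples` is a list of dicts with hypothetical keys:
    'batch', 'compound' ('control' for untreated) and 'expr' (list of log2 values).
    Assumption: controls are averaged per gene within each batch.
    """
    # Collect control profiles per batch.
    controls = {}
    for s in samples:
        if s["compound"] == "control":
            controls.setdefault(s["batch"], []).append(s["expr"])

    # Mean control value per gene, per batch.
    baseline = {
        batch: [mean(values) for values in zip(*profiles)]
        for batch, profiles in controls.items()
    }

    # Keep only treated samples, expressed relative to their batch baseline.
    corrected = []
    for s in samples:
        if s["compound"] == "control":
            continue
        base = baseline[s["batch"]]
        corrected.append({
            "batch": s["batch"],
            "compound": s["compound"],
            "expr": [x - b for x, b in zip(s["expr"], base)],
        })
    return corrected

# Toy data: two batches with an offset of 2 units between their controls.
samples = [
    {"batch": 1, "compound": "control", "expr": [8.0, 5.0]},
    {"batch": 1, "compound": "VPA", "expr": [9.0, 4.0]},
    {"batch": 2, "compound": "control", "expr": [10.0, 7.0]},
    {"batch": 2, "compound": "VPA", "expr": [11.0, 6.0]},
]
corrected = subtract_controls(samples)
```

In this toy example the two treated samples differ by the batch offset before correction and coincide afterwards, which is the kind of alignment the caption reports for panel e.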
Fig. 2
Characterization of transcriptional changes induced by HDACi and mercurials. a Differentiating cells were treated for 6 days with toxicants (four samples per compound; as in Fig. 1) before RNA was prepared and gene expression was measured on Affymetrix microarrays. The 50 genes with highest variance between all samples were selected for clustering (=clustering set). Then, all samples were clustered (Euclidean distance) on the basis of gene expression values for this set. The results are represented as a heatmap with each row representing one gene, and the colour of each square indicating the absolute gene expression level (blue low; green middle; yellow high). b Number of differentially expressed genes (DEG) after exposure to toxicants compared to untreated controls (detailed data are shown in supplemental material). c Recombinant active caspase-3 was incubated for 30 min with respective mercurials at indicated concentrations. Then, the enzymatic activity was determined by a fluorometric assay. The caspase activity is represented in percentage relative to untreated control enzyme. The BMC of the respective mercurial (used in this study for microarray analysis) is indicated by a red line; data are mean ± SEM; n = 3; panobino, panobinostat (colour figure online)
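The gene selection and distance calculation described for panel a of Fig. 2 can be sketched as follows. This is an illustration only: it computes the pairwise Euclidean distances between samples on the highest-variance genes and leaves out the clustering step itself, whose linkage method the caption does not state. The gene names and values are made up, and the population variance is an assumption.

```python
from math import dist
from statistics import pvariance

def top_variance_genes(expr, n):
    """Return the n genes with the highest variance across all samples.

    `expr` maps a gene name to its list of expression values (one per sample).
    """
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return ranked[:n]

def sample_distances(expr, genes):
    """Pairwise Euclidean distances between samples, using only `genes`."""
    # Transpose gene-wise rows into one expression tuple per sample.
    columns = list(zip(*(expr[g] for g in genes)))
    return [[dist(a, b) for b in columns] for a in columns]

# Toy data with four samples; the caption uses the top 50 genes, here n = 2.
expr = {
    "geneA": [1.0, 1.0, 5.0, 5.0],
    "geneB": [2.0, 2.1, 2.0, 2.1],
    "geneC": [0.0, 0.0, 3.0, 3.0],
}
genes = top_variance_genes(expr, 2)
d = sample_distances(expr, genes)
```

The resulting matrix `d` could then be passed to any hierarchical clustering routine.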
Fig. 3
Characterization of the HDACi consensus transcriptome effect in neurally differentiating stem cells. Differentiating cells were treated as indicated in Fig. 1 and used for whole transcriptome analysis. From the differentially expressed genes (DEG), we identified 405 up-regulated and 190 down-regulated ‘consensus genes’, each of them regulated by at least four HDACi (cut-off FC > 1.5). For each consensus gene, the mean fold change (FC) of all 6 HDACi was calculated and used for further analysis (detailed data are shown in supplemental material). a The top 20 up- and down-regulated consensus genes are displayed. b The gene ontology (GO) categories overrepresented amongst up- and down-regulated consensus genes (p < 0.05) were classified into 7 superordinate cell biological processes: neuro(nal development), mesoderm(al development), general development, migration/adhesion, neural crest, general cellular function/signalling and uncategorized, and presented as a ring diagram to visualize the relative distribution. The number of GO categories in each group is indicated. c The top 30 up- and down-regulated consensus genes were classified into 7 superordinate cell biological processes. d KEGG pathways overrepresented amongst consensus genes were identified and the five with the lowest p values (all with p < 0.02) are displayed. The numbers of total genes and the numbers of HDACi consensus genes are shown for each pathway
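The consensus-gene rule stated in the Fig. 3 caption (regulated by at least four of the six HDACi, cut-off FC > 1.5, mean fold change over all six) can be sketched as below. Several points are assumptions, not taken from the source: fold changes are handled on a log2 scale, down-regulation is treated symmetrically, the mean is an arithmetic mean of log2 values, and the statistical filter that defines differentially expressed genes is not modelled. Gene names and values are hypothetical.

```python
from math import log2
from statistics import mean

LOG2_CUTOFF = log2(1.5)   # cut-off FC > 1.5, expressed on a log2 scale (assumption)
MIN_COMPOUNDS = 4         # "regulated by at least four HDACi"

def consensus_genes(log2_fc):
    """Split genes into up- and down-regulated consensus sets.

    `log2_fc` maps gene -> {compound: log2 fold change versus control}.
    Returns two dicts mapping each consensus gene to its mean log2 FC
    over all compounds (not only the ones passing the cut-off).
    """
    up, down = {}, {}
    for gene, by_compound in log2_fc.items():
        values = list(by_compound.values())
        n_up = sum(v > LOG2_CUTOFF for v in values)
        n_down = sum(v < -LOG2_CUTOFF for v in values)
        if n_up >= MIN_COMPOUNDS:
            up[gene] = mean(values)
        elif n_down >= MIN_COMPOUNDS:
            down[gene] = mean(values)
    return up, down

compounds = ["valproic acid", "trichostatin A", "vorinostat",
             "belinostat", "panobinostat", "entinostat"]
toy = {
    "gene_up": dict(zip(compounds, [1.0, 1.2, 0.9, 1.1, 0.2, 0.1])),
    "gene_down": dict(zip(compounds, [-1.0, -1.0, -1.0, -1.0, -1.0, 0.0])),
    "gene_mixed": dict(zip(compounds, [1.0, 1.0, 1.0, -1.0, -1.0, -1.0])),
}
up, down = consensus_genes(toy)
```

Here `gene_up` passes because four compounds exceed the cut-off, while `gene_mixed` is excluded because neither direction reaches four compounds.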
Fig. 4
Detection and visualization of transcription factor (TF) networks affected by HDACi. a The CellNet database (3297 microarray sets from all major tissues) was used to construct a generic human TF network, based on statistical co-expression information and graph-theoretical design principles. Each node represents a TF gene, and each edge suggests co-regulation. The edge length is driven by the number of edges on neighbouring nodes, not by the strength of co-regulation. Nodes are placed according to the Fruchterman–Reingold algorithm. Clusters (coded by same colours) were defined by an optimization algorithm that tries to maximize the modularity of the division of the graph into clusters. Then, GO term overrepresentation analysis was performed for each cluster to identify its biological role, and naming of the 18 clusters is based on these findings. Nodes at the rim of the network, displayed in orange, have not been assigned to defined clusters. b The set of genes significantly up-regulated on DoD6 versus DoD0 (p < 0.05; FC ≥ 2.0) was retrieved from Balmer et al. (2014), and the TFs of this gene set were marked (red dots) in the TF network (see large, scalable version in supplemental material). c All TFs were identified amongst the HDACi consensus genes and marked in the TF network (blue down-regulated; red up-regulated). This diagram indicates, together with information from (a), which parts of the TF network are affected by at least 4 of the 6 HDACi used here. The clusters ‘forebrain development’ and ‘neuronal development’ have been encircled for better visualization in b and c (colour figure online)
Fig. 5
Design of a transcriptome-based classifier to identify HDACi. The scheme on top illustrates the setup of a support vector machine (SVM)-based classifier for HDACi. The numbers denote data sets for the 12 toxicants used in this study. Colours denote a grouping in mercurials (blue) and HDACi (orange). Below, the principles of leave-one-out (left) and leave-two-out classification are shown. For instance, when chemical-4 is ‘left out’, this means that the other 11 compounds are used to build a classifier according to the rules specified above. Then, it is tested, how well the classification applies to chemical-4. For leave-two-out, the procedure is similar, in that a classifier is built from 10 remaining chemicals to predict one of the compounds left out (e.g. chemical-8). a The classifier was validated by a leave-one-out procedure. The calculated probabilities for a toxicant to be an HDACi are shown for each of the four replicate samples, and the overall prediction is shown in the last column. b Validation of the SVM classifier by a leave-two-out procedure. The rows indicate which compound was left out in addition to the predicted one. The probabilities (prediction) to be an HDACi are presented (for the 144 combinations) as mean of four independent experiments in a cross table. Probabilities of >0.5 predict for a compound to be an HDACi (red) and <0.5 predict for a mercurial (blue). Incorrect predictions are indicated by a red frame. c The SVM-based classifier was used for leave-one-out, leave-two-out, leave-3-out and leave-4-out prediction of belinostat being an HDACi. Predictions were performed for each of the four replicates, and each prediction is represented by a single dot. To demonstrate the role of entinostat for the correctness of the prediction, cases in which entinostat was amongst the left-out compounds are marked in red. Panobino, Panobinostat (colour figure online)
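The compound-wise leave-one-out and leave-two-out schemes described in the Fig. 5 caption can be made concrete by enumerating the training sets. The sketch below only generates the splits; it does not train an SVM. It assumes that the 144 cells of the cross table are the 12 × 12 pairs of (predicted compound, additionally left-out compound) and that the diagonal corresponds to plain leave-one-out, which the caption does not state explicitly.

```python
HDACI = ["valproic acid", "trichostatin A", "vorinostat",
         "belinostat", "panobinostat", "entinostat"]
MERCURIALS = ["methylmercury", "thimerosal", "mercury(II)chloride",
              "mercury(II)bromide", "4-chloromercuribenzoic acid",
              "phenylmercuric acid"]
COMPOUNDS = HDACI + MERCURIALS

def leave_out_splits(compounds):
    """Yield (predicted, also_left_out, training_set) for each cross-table cell.

    Assumption: when also_left_out == predicted, the cell is a plain
    leave-one-out prediction (11 training compounds); otherwise two
    compounds are withheld and 10 remain for training.
    """
    for predicted in compounds:
        for also_left_out in compounds:
            held_out = {predicted, also_left_out}
            training = [c for c in compounds if c not in held_out]
            yield predicted, also_left_out, training

splits = list(leave_out_splits(COMPOUNDS))
n_leave_one_out = sum(1 for p, q, _ in splits if p == q)
n_leave_two_out = sum(1 for p, q, _ in splits if p != q)
```

For each split, a classifier would be fitted on all replicate samples of the training compounds and then applied to the replicates of the predicted compound, so that no sample of the predicted compound is seen during training.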
Fig. 6
Validation of the transcriptome-based classifier to identify HDACi. a Differentiating cells were treated as indicated in Fig. 1 and transcriptome changes of neurally differentiating stem cells induced by HDACi and mercurials are plotted in a PCA (as in Fig. 1e) together with samples treated with 25, 150, 350, 450, 550, 650 and 800 µM and 1 mM valproic acid (VPA) obtained from Waldmann et al. (2014). Each point represents one experiment (=data from one microarray), and the colour coding indicates the compound used in the experiment, mercurials (blue shades), HDACi (red shades) and VPA legacy data (green). The four samples from the present study (VPA classifier) have been encircled for better visualisation. The purple arrow indicates the track of transcriptional changes after exposure to increasing concentrations of VPA in the Waldmann et al. (2014) data set. The SVM classifier was applied to this (green) data set, and the prediction of VPA, at indicated concentrations (25 µM–1 mM) acting on stem cell differentiation like an HDACi, is shown in the table as a mean of four replicate samples. The lower row of the table indicates whether the respective sample triggered developmental toxicity (+) or not (−), according to Waldmann et al. (2014). b The diagram shows various schedules of drug exposure. Grey bars indicate the period of drug exposure with 600 µM VPA or 10 nM TSA, and white open bars indicate culture periods in medium without HDACi. The samples were analysed at the times indicated. Exposures of a limited duration relative to the overall experiment were termed ‘pulsed’ treatments, and these were distinguished as early, medium and late pulse according to the exposure scheme. c, d The tables indicate the calculated probability of VPA or TSA acting like an HDACi when used as described in b. Probabilities >0.5 are defined as HDACi classification (green), and p < 0.5 indicates that the experimental condition did not show a canonical HDAC effect (colour figure online)
Fig. 7
Establishment of an optimized classifier based on 8 genes. a Initially, 100 probe sets (PS) were used for the SVM classifier. For further optimization, the number of PS was continuously reduced by one PS (selected randomly), and for each step, the proportion of correct prediction for a toxicant being an HDACi was calculated using the leave-one-out strategy (red dots) and leave-two-out strategy (black dots). The thresholds for ‘acceptable predictivity’ and ‘maximum predictivity’ are indicated by a dashed line. b The results of probability predictions for a toxicant being an HDACi determined by a 100-PS-based SVM classifier (as described in Fig. 5) were compared with a 10-PS-based SVM classifier (derived from a) in a correlation scatter plot. Note: when an HDACi was predicted to be an HDACi with p = 0.7, the data point was logged at 0.7. When a mercurial was predicted to be an HDACi with p = 0.3, the data point was also logged at 0.7 (with HDACi prediction as reference point). The results under leave-one-out conditions are presented as large dots and under leave-two-out conditions as small dots. For the three leave-two-out wrong predictions, the respective compound pairs are listed. c For the genes constituting the minimal HDACi classifier (10-PS, corresponding to 8 genes), function, role and regulation (mean fold change (FC) of all 6 HDACi) are listed (for references see suppl. Fig. S7). d The changes in expression of the 8 HDACi classifier genes (from the 10-PS classifier) induced by HDACi (red) and mercurials (blue) are graphed (colour figure online)
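The logging rule in the note to panel b of Fig. 7 amounts to reporting the probability assigned to the true class: the predicted HDACi probability itself for an HDACi, and its complement for a mercurial. A minimal sketch, with hypothetical function names and the 0.5 decision threshold taken from the Fig. 5 and Fig. 6 captions:

```python
def probability_of_true_class(p_hdaci, is_hdaci):
    """Convert a predicted HDACi probability into the value logged in the plot.

    For an HDACi the logged value is p_hdaci; for a mercurial it is
    1 - p_hdaci, so that high values always mean a correct call.
    """
    return p_hdaci if is_hdaci else 1.0 - p_hdaci

def is_correct(p_hdaci, is_hdaci):
    """A prediction counts as correct when the true class gets more than 0.5."""
    return probability_of_true_class(p_hdaci, is_hdaci) > 0.5

def proportion_correct(predictions):
    """`predictions` is a list of (p_hdaci, is_hdaci) pairs."""
    return sum(is_correct(p, h) for p, h in predictions) / len(predictions)

# The two cases from the caption, plus one wrong call for contrast.
logged_hdaci = probability_of_true_class(0.7, True)       # HDACi predicted with p = 0.7
logged_mercurial = probability_of_true_class(0.3, False)  # mercurial predicted with p = 0.3
share = proportion_correct([(0.7, True), (0.3, False), (0.6, False)])
```

Both caption cases are logged at 0.7, which puts the 100-PS and 10-PS classifiers on a common axis in the scatter plot regardless of compound class.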

References

    1. Alexa ARJ (2010) topGO: enrichment analysis for gene ontology. R package 2.14.10. http://bioconductor.wustl.edu/bioc/html/topGO.html
    2. Bahr GF, Moberger G. Methyl-mercury-chloride as a specific reagent for protein-bound sulfhydryl groups; electron stains II. Exp Cell Res. 1954;6(2):506–518. doi: 10.1016/0014-4827(54)90199-8.
    3. Balmer NV, Leist M. Epigenetics and transcriptomics to detect adverse drug effects in model systems of human development. Basic Clin Pharmacol Toxicol. 2014;115(1):59–68. doi: 10.1111/bcpt.12203.
    4. Balmer NV, Weng MK, Zimmer B, et al. Epigenetic changes and disturbed neural development in a human embryonic stem cell-based model relating to the fetal valproate syndrome. Hum Mol Genet. 2012;21(18):4104–4114. doi: 10.1093/hmg/dds239.
    5. Balmer NV, Klima S, Rempel E, et al. From transient transcriptome responses to disturbed neurodevelopment: role of histone acetylation and methylation as epigenetic switch between reversible and irreversible drug effects. Arch Toxicol. 2014;88(7):1451–1468. doi: 10.1007/s00204-014-1279-6.
