The Carcinogenome Project: In Vitro Gene Expression Profiling of Chemical Perturbations to Predict Long-Term Carcinogenicity

Environ Health Perspect. 2019 Apr;127(4):47002. doi: 10.1289/EHP3986.


Background: Most chemicals in commerce have not been evaluated for their carcinogenic potential. The de facto gold standard for carcinogen testing is the 2-y rodent bioassay, a time-consuming and costly procedure. High-throughput in vitro assays are a promising alternative for addressing these limitations in carcinogen screening.

Objectives: We developed a screening process for predicting chemical carcinogenicity and genotoxicity and for characterizing modes of action (MoAs) using in vitro gene expression assays.

Methods: We generated a large toxicogenomics resource comprising [Formula: see text] expression profiles corresponding to 330 chemicals profiled in HepG2 (a human hepatocellular carcinoma cell line) at multiple doses and in replicate. Predictive models of carcinogenicity and genotoxicity were built using a random forest classifier. Differential pathway enrichment analysis was performed to identify pathways associated with carcinogen exposure. Signatures of carcinogenicity and genotoxicity were compared with external sources, including DrugMatrix and the Connectivity Map.

Results: Among profiles with sufficient bioactivity, our classifiers achieved an area under the ROC curve (AUC) of 72.2% for predicting carcinogenicity and 82.3% for predicting genotoxicity. Chemical bioactivity, as measured by the strength and reproducibility of the transcriptional response, was not significantly associated with long-term carcinogenicity at doses up to [Formula: see text]. However, sufficient bioactivity was necessary for a chemical to be used for prediction of carcinogenicity. Pathway enrichment analysis revealed pathways consistent with those known to drive cancer, including DNA damage and repair. The data are available at , and a portal for query and visualization of the results is accessible at .
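
For readers less familiar with the AUC metric reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It uses the pairwise (Mann-Whitney) formulation and is not the authors' evaluation code. The function name `roc_auc` and the labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC is the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = carcinogen, 0 = non-carcinogen (made-up values).
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical classifier scores, e.g. the fraction of trees voting "carcinogen".
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a reported value of 72.2% means that a randomly chosen carcinogen profile would outscore a randomly chosen non-carcinogen profile in roughly 72% of such pairs.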

Discussion: We demonstrated an in vitro screening approach using gene expression profiling to predict carcinogenicity and infer MoAs of chemical perturbations.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Carcinogenicity Tests / instrumentation
  • Carcinogenicity Tests / methods*
  • Carcinogens / toxicity*
  • DNA Damage
  • Gene Expression Profiling / instrumentation
  • Gene Expression Profiling / methods*
  • Hep G2 Cells
  • Humans
  • In Vitro Techniques / instrumentation
  • In Vitro Techniques / methods
  • ROC Curve
  • Toxicogenetics / methods*


Substances

  • Carcinogens