Nat Biotechnol. 2016 Jan;34(1):70-77.
doi: 10.1038/nbt.3419. Epub 2015 Dec 14.

Improving drug discovery with high-content phenotypic screens by systematic selection of reporter cell lines

Jungseog Kang et al. Nat Biotechnol. 2016 Jan.

Abstract

High-content, image-based screens enable the identification of compounds that induce cellular responses similar to those of known drugs but through different chemical structures or targets. A central challenge in designing phenotypic screens is choosing suitable imaging biomarkers. Here we present a method for systematically identifying optimal reporter cell lines for annotating compound libraries (ORACLs), whose phenotypic profiles most accurately classify a training set of known drugs. We generate a library of fluorescently tagged reporter cell lines, and let analytical criteria determine which among them, the ORACL, best classifies compounds into multiple, diverse drug classes. We demonstrate that an ORACL can functionally annotate large compound libraries across diverse drug classes in a single-pass screen and confirm high prediction accuracy by means of orthogonal, secondary validation assays. Our approach will increase the efficiency, scale and accuracy of phenotypic screens by maximizing their discriminatory power.

Figures

Figure 1. Overview of method
Overview of image-based phenotypic screening steps: Libraries of compounds (left) are applied to cells labeled with biomarkers (middle); cellular responses are extracted from images and used to construct a phenotypic profile (right; cartooned in two dimensions as black points); and compounds are functionally classified (i.e., annotated) based on comparison to phenotypic profiles of known reference drugs (colored circles). Overview of approach for selecting an Optimal Reporter cell line for Annotating Compound Libraries, called an “ORACL.” We profile a collection of reference drugs using reporter cell lines labeled for diverse biomarkers. Our ORACL is defined as the reporter cell line whose phenotypic profiles give the highest classification accuracy of the reference drugs. We select this ORACL for large-scale phenotypic screens of unknown compound libraries.
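The selection rule in this caption (pick the reporter whose profiles classify the reference drugs most accurately) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the nearest-centroid classifier, the leave-one-out scheme, the two-dimensional toy profiles, and the names `reporter_A`, `reporter_B`, `class_1`, and `class_2`. The classifier and cross-validation procedure actually used are described in the paper's Online Methods, not on this page.

```python
import math


def nearest_centroid_loo_accuracy(profiles, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed classifier)."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(profiles, labels)):
        # Build class centroids from every profile except the held-out one.
        sums, counts = {}, {}
        for j, (profile, label) in enumerate(zip(profiles, labels)):
            if j == i:
                continue
            running = sums.setdefault(label, [0.0] * len(profile))
            for k, value in enumerate(profile):
                running[k] += value
            counts[label] = counts.get(label, 0) + 1
        centroids = {
            label: [s / counts[label] for s in vec] for label, vec in sums.items()
        }
        # Assign the held-out profile to the class with the closest centroid.
        predicted = min(centroids, key=lambda lab: math.dist(held_out, centroids[lab]))
        correct += predicted == true_label
    return correct / len(profiles)


def select_oracl(reporter_profiles, labels):
    """Return the reporter with the highest accuracy, plus all accuracies."""
    accuracies = {
        name: nearest_centroid_loo_accuracy(profiles, labels)
        for name, profiles in reporter_profiles.items()
    }
    return max(accuracies, key=accuracies.get), accuracies


# Hypothetical toy data: six reference drugs in two classes, profiled in two reporters.
labels = ["class_1", "class_1", "class_1", "class_2", "class_2", "class_2"]
reporter_profiles = {
    # Drugs of the same class give similar profiles in this reporter.
    "reporter_A": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)],
    # Profiles within a class are inconsistent in this reporter.
    "reporter_B": [(0.0, 0.0), (5.0, 5.0), (0.1, 0.1), (5.1, 5.0), (0.2, 0.0), (4.9, 5.1)],
}

best, accuracies = select_oracl(reporter_profiles, labels)
print(best, accuracies)
```

In this toy setup the reporter with consistent within-class profiles is selected, which mirrors the contrast between the ORACL and a "mediocre" reporter described for Figure 2.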
Figure 2. An ORACL is identified that best distinguishes among drug classes
(a) A parent A549 cell line was built with a construct (pSeg) to express cytosolic (mCherry) and nuclear (H2B-CFP) fluorescent proteins to aid in automated cellular region identification. A library of diverse reporter cell lines was built from this parent line using a strategy (CD tagging) that randomly incorporated YFP into different proteins (one per reporter cell line). “Untagged” refers to the parental pSeg-tagged line that lacks a CD tag. (b) Left: Drug classification accuracies for each of our 93 CD-tagged reporters. Mean (black dots) and standard deviation (gray bar) of prediction accuracies were calculated from 100 cross-validation studies (Online Methods). Middle: drug-response profiles of the ORACL and a “mediocre” reporter cell line were visualized by MDS plot (top and bottom, tagged for XRCC5 or SEPT11, respectively). Each drug (or DMSO) profile is represented by a point and colored according to its drug class. Right: Representative cellular response images for the indicated drugs in the MDS plots at left. The ORACL shows consistent phenotypes within drug classes, whereas the “mediocre” reporter cell line shows inconsistent phenotypes within the same drug classes. Fluorescent reporters: Blue: CFP-nuclear label; Red: mCherry-cytosolic label; Green: YFP-CD tag (intensity scale is the same for Blue and Red, but is adjusted for Green per reporter cell line). Scale bar: 10 μm. Drugs: i: Epothilone B; ii: Nocodazole; iii: Apicidin; iv: Oxamflatin; v: CPT; vi: Etoposide.
Figure 3. Compound hits across multiple drug classes are identified from a single-pass screen
(a) Shown are LDA projections of phenotypic profiles for reference drugs and compounds in batch 1 (NCI) and batch 2 (Prestwick and 8K). Profiles were computed by concatenating data from 24 and 48 h. Each point represents the projected profile for a tested compound and concentration. Reference drugs are colored according to drug classes. Hits and non-hits are shown as black or gray dots, respectively. (b) Summary of screen: proportion of primary (top) or secondary “high confidence” (middle) hits, and distribution of predicted drug classes for hits (bottom). DC: discriminant component.
Figure 4. Secondary studies validate predictions across diverse drug classes
Top: False discovery rates (FDR; y-axes) were calculated for 38 reference drugs (FDR_Ref, solid line) or 175 high-confidence hits (FDR_hits, dashed line) at different thresholds for the readouts selected in each validation assay (x-axes; Online Methods). Vertical dark gold dashed lines: readout thresholds at FDR_Ref = 0.1 (horizontal dark gold dashed line). Middle: Readout values (x-axes) of DMSO, reference drugs (at five 5-fold serial dilutions), and 175 high-confidence hits are shown for each validation experiment. Reference drugs are grouped according to drug classes; each line represents the dose response of one drug. Circle size reflects concentration (larger circles indicate higher concentrations). High-confidence hits that were predicted to belong to the class being validated are highlighted with corresponding colors. Bottom: representative images of cells treated with DMSO (−), positive control reference drugs (+), and secondary hits (?) indicated by black arrows in the middle panel. Scale bar = 10 μm.
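How a readout threshold might be chosen at a target FDR can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the FDR definition used here (among compounds whose readout is at or above the threshold, the fraction that do not belong to the class being validated), the toy readout values, and the function names are all hypothetical. The exact definition is given in the paper's Online Methods.

```python
def false_discovery_rate(readouts, in_class, threshold):
    """Fraction of compounds called positive (readout >= threshold) that are
    not members of the class being validated (assumed FDR definition)."""
    called = [flag for value, flag in zip(readouts, in_class) if value >= threshold]
    if not called:
        return 0.0  # convention for this sketch: no calls means no false discoveries
    return sum(1 for flag in called if not flag) / len(called)


def lowest_threshold_at_fdr(readouts, in_class, target_fdr=0.1):
    """Scan candidate thresholds in increasing order and return the first one
    whose FDR is at or below the target; None if no threshold qualifies."""
    for threshold in sorted(set(readouts)):
        if false_discovery_rate(readouts, in_class, threshold) <= target_fdr:
            return threshold
    return None


# Hypothetical readout values for ten reference drugs; True marks drugs that
# belong to the class targeted by this validation assay.
readouts = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
in_class = [False, False, False, False, True, False, True, True, True, True]

threshold = lowest_threshold_at_fdr(readouts, in_class, target_fdr=0.1)
print(threshold, false_discovery_rate(readouts, in_class, 0.6))
```

The same threshold could then be applied to the high-confidence hits to estimate their FDR, which is the comparison the top panels appear to show.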
Figure 5. The ORACL can identify novel compound groupings
Compound clusters were identified by hierarchical clustering (see Online Methods). Colored dots correspond to reference drugs. Colored labels and lines indicate examples of clusters that contain multiple, consistently annotated compounds in drug classes not used in the selection of the ORACL.
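A minimal sketch of hierarchical clustering of compound profiles is given below. The caption does not state the linkage, distance metric, or cutoff, so average linkage, Euclidean distance, the cutoff value, and the toy profiles are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def average_linkage_clusters(profiles, cutoff):
    """Agglomerative clustering with average linkage (assumed), merging the
    closest pair of clusters until the smallest linkage exceeds the cutoff."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(math.dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        (x, y), distance = min(
            (
                ((x, y), linkage(clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if distance > cutoff:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    # Sort for a deterministic, readable result.
    return sorted(sorted(c) for c in clusters)


# Hypothetical two-dimensional profiles for six compounds.
compound_profiles = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (10.0, 10.0)]
compound_clusters = average_linkage_clusters(compound_profiles, cutoff=1.0)
print(compound_clusters)
```

Clusters containing several compounds with a shared annotation would then be the candidates for the kind of novel groupings the figure highlights.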
