BMC Bioinformatics. 2016 Apr 16;17:164.
doi: 10.1186/s12859-016-1015-8.

BAGEL: A Computational Framework for Identifying Essential Genes From Pooled Library Screens

Free PMC article

Traver Hart et al. BMC Bioinformatics. 2016.


Background: The adaptation of the CRISPR-Cas9 system to pooled library gene knockout screens in mammalian cells represents a major technological leap over RNA interference, the prior state of the art. New methods for analyzing the data and evaluating results are needed.

Results: We offer BAGEL (Bayesian Analysis of Gene EssentiaLity), a supervised learning method for analyzing gene knockout screens. Coupled with gold-standard reference sets of essential and nonessential genes, BAGEL offers significantly greater sensitivity than current methods, while computational optimizations reduce runtime by an order of magnitude.

Conclusions: Using BAGEL, we identify ~2000 fitness genes in pooled library knockout screens in human cell lines at 5 % FDR, a major advance over competing platforms. BAGEL shows high sensitivity and specificity even across screens performed by different labs using different libraries and reagents.

Keywords: CRISPR; Cancer; Essential genes; Functional genomics; Genetic screens.


Fig. 1
BAGEL overview. a Simulated growth curves of wildtype cells (blue), which double at every time increment. When genetic perturbations are induced (T = 3), moderate (purple) to severe (magenta) fitness defects, growth arrest (red), and cell death (black) result in different relative growth rates. At sampled timepoints, fold change relative to wildtype growth is the readout from a sequencing assay. b Representative data from one replicate. The fold change distribution of all gRNAs targeting essential genes (red) is shifted relative to the fold change distribution of all gRNAs targeting nonessential genes (blue). The fold change distribution for all gRNAs (black) is shown for reference. c The log likelihood functions of the red and blue curves from (b), left Y axis. The BAGEL method calculates the log likelihood ratio (black, right Y axis) of these two curves, within empirical boundaries (green dashes), for each bootstrap iteration; see Methods for details.
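The log likelihood ratio described in panel c can be illustrated with a minimal sketch. This is not the published BAGEL implementation: the hand-rolled Gaussian kernel density estimate, the bandwidth, the log base 2, the boundary values, the number of bootstrap iterations, and all function names are assumptions made for illustration, and the fold changes are simulated toy data.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def log2_likelihood_ratio(fc, ess_density, non_density, lower, upper):
    """log2 of P(fc | essential) / P(fc | nonessential), with fc clamped to [lower, upper]."""
    x = min(max(fc, lower), upper)  # stay inside the empirical boundaries
    return math.log2(ess_density(x)) - math.log2(non_density(x))


def bootstrap_score(gene_fcs, ess_fcs, non_fcs, lower, upper,
                    n_boot=20, bandwidth=0.5, seed=0):
    """Average, over bootstrap resamples of the reference fold changes,
    the summed log likelihood ratio of one gene's gRNA fold changes."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        # Resample each reference distribution with replacement (assumed scheme).
        ess_density = gaussian_kde(rng.choices(ess_fcs, k=len(ess_fcs)), bandwidth)
        non_density = gaussian_kde(rng.choices(non_fcs, k=len(non_fcs)), bandwidth)
        scores.append(sum(
            log2_likelihood_ratio(fc, ess_density, non_density, lower, upper)
            for fc in gene_fcs
        ))
    return sum(scores) / n_boot


# Toy reference data: essential gRNAs are depleted, nonessential gRNAs are not.
toy_rng = random.Random(1)
ess_fcs = [toy_rng.gauss(-3.0, 1.0) for _ in range(200)]
non_fcs = [toy_rng.gauss(0.0, 0.5) for _ in range(200)]

depleted = bootstrap_score([-3.2, -2.8, -3.0], ess_fcs, non_fcs, lower=-4.0, upper=1.0)
neutral = bootstrap_score([0.1, -0.2, 0.0], ess_fcs, non_fcs, lower=-4.0, upper=1.0)
print("depleted gene score:", depleted)
print("neutral gene score:", neutral)
```

In this toy setup a gene whose gRNAs are strongly depleted scores above zero and a gene with unchanged gRNAs scores below zero. Clamping the fold change to the boundaries keeps the ratio from being evaluated where both density estimates are poorly supported by data.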
Fig. 2
Precision-recall curves for BAGEL results for GBM (a), HCT116 (b), HeLa (c), and RPE1 (d) screens using the TKO library. Where indicated, a single timepoint is plotted. “Integrated” = Bayes Factors summed across all timepoints in the experiment
Fig. 3
Comparing early and late hits. a Number of fitness genes detected at early timepoint (cyan), late timepoint (green), or both (blue) in each TKO screen. b Representative data from GBM screen. Most GO_BP terms enriched in late-only genes (green) extend observations of terms enriched in genes found in both early and late timepoints
Fig. 4
Comparing BAGEL with MAGeCK. For each cell line, precision-recall curves were plotted for BAGEL and MAGeCK results using the last timepoint of the screen. Red circle indicates results at MAGeCK-reported 10 % FDR cutoff. a-d TKO screens from Hart et al. [15]. e-h Screens from Wang et al. [17]
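The precision-recall curves of Figs. 2 and 4, and the "Integrated" Bayes Factor of Fig. 2, can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names, the gene names, and the Bayes Factor values are hypothetical, and recall is computed against the reference essential genes that received a score, which is an assumption.

```python
def integrate_bayes_factors(bf_by_timepoint):
    """Sum each gene's Bayes Factor across all timepoints."""
    total = {}
    for bfs in bf_by_timepoint:
        for gene, bf in bfs.items():
            total[gene] = total.get(gene, 0.0) + bf
    return total


def precision_recall(bf, essential, nonessential):
    """Rank reference genes by descending Bayes Factor and return a list of
    (recall, precision) points, one per ranked reference gene."""
    ranked = sorted(
        (g for g in bf if g in essential or g in nonessential),
        key=lambda g: bf[g],
        reverse=True,
    )
    n_essential = sum(1 for g in ranked if g in essential)
    if n_essential == 0:
        return []
    true_pos = false_pos = 0
    curve = []
    for gene in ranked:
        if gene in essential:
            true_pos += 1
        else:
            false_pos += 1
        curve.append((true_pos / n_essential, true_pos / (true_pos + false_pos)))
    return curve


# Hypothetical reference sets and per-timepoint Bayes Factors.
essential = {"E1", "E2", "E3"}
nonessential = {"N1", "N2", "N3"}
early = {"E1": 5.0, "E2": 1.0, "E3": -1.0, "N1": 0.5, "N2": -4.0, "N3": -6.0}
late = {"E1": 8.0, "E2": 3.0, "E3": 2.0, "N1": -3.0, "N2": -5.0, "N3": -7.0}

integrated = integrate_bayes_factors([early, late])
early_curve = precision_recall(early, essential, nonessential)
integrated_curve = precision_recall(integrated, essential, nonessential)
print("early:", early_curve)
print("integrated:", integrated_curve)
```

In this toy example the early timepoint ranks one nonessential gene above an essential gene, so precision drops before full recall is reached, while the summed scores separate the two reference sets completely.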


    1. Carette JE, Guimaraes CP, Wuethrich I, Blomen VA, Varadarajan M, Sun C, Bell G, Yuan B, Muellner MK, Nijman SM, et al. Global gene disruption in human cells to assign genes to phenotypes by deep sequencing. Nat Biotechnol. 2011;29(6):542–6. doi: 10.1038/nbt.1857.
    2. Echeverri CJ, Beachy PA, Baum B, Boutros M, Buchholz F, Chanda SK, Downward J, Ellenberg J, Fraser AG, Hacohen N, et al. Minimizing the risk of reporting false positives in large-scale RNAi screens. Nat Methods. 2006;3(10):777–9. doi: 10.1038/nmeth1006-777.
    3. Hart T, Brown KR, Sircoulomb F, Rottapel R, Moffat J. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol Syst Biol. 2014;10:733. doi: 10.15252/msb.20145216.
    4. Kaelin WG Jr. Molecular biology. Use and abuse of RNAi to study mammalian gene function. Science. 2012;337(6093):421–2. doi: 10.1126/science.1225787.
    5. Chen S, Sanjana NE, Zheng K, Shalem O, Lee K, Shi X, Scott DA, Song J, Pan JQ, Weissleder R, et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell. 2015;160(6):1246–60. doi: 10.1016/j.cell.2015.02.038.
