EPIC: A Tool to Estimate the Proportions of Different Cell Types from Bulk Gene Expression Data

Methods Mol Biol. 2020;2120:233-248. doi: 10.1007/978-1-0716-0327-7_17.

Abstract

Gene expression profiling is now routinely performed on clinically relevant samples (e.g., from tumor specimens). Such measurements are often obtained from bulk samples containing a mixture of cell types. Knowledge of the proportions of these cell types is crucial, as they are key determinants of disease evolution and response to treatment. Moreover, heterogeneity in cell type proportions across samples is an important confounding factor in downstream analyses. Many tools have been developed to estimate the proportions of the different cell types from bulk gene expression data. Here, we provide guidelines and examples on how to use these tools, with a special focus on our recent computational method EPIC (Estimating the Proportions of Immune and Cancer cells). EPIC includes RNA-seq-based gene expression reference profiles from immune cells and other nonmalignant cell types found in tumors. EPIC can additionally handle user-defined gene expression reference profiles. Some unique features of EPIC include the ability to account for an uncharacterized cell type, the introduction of a renormalization step to account for different mRNA content in each cell type, and the use of single-cell RNA-seq data to derive biologically relevant reference gene expression profiles. EPIC is available as a web application (http://epic.gfellerlab.org) and as an R package (https://github.com/GfellerLab/EPIC).
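
The two features named above, the uncharacterized cell type and the mRNA renormalization step, can be illustrated with a minimal sketch. This is not EPIC's code, only one plausible reading of the idea: it assumes a deconvolution has already produced mRNA proportions for the known cell types, treats whatever is left over as the uncharacterized cell type, and then divides each proportion by an mRNA-per-cell value before rescaling to a sum of 1. The function name, the `otherCells` label, and all numeric values are hypothetical and chosen only for illustration.

```python
def mrna_to_cell_fractions(mrna_props, mrna_per_cell, other_mrna_per_cell=1.0):
    """Convert mRNA proportions into cell fractions (illustrative sketch).

    mrna_props: dict mapping cell type -> share of the bulk mRNA (sum <= 1).
    mrna_per_cell: dict mapping cell type -> relative mRNA content per cell.
    other_mrna_per_cell: assumed mRNA content of the uncharacterized cells.
    """
    total = sum(mrna_props.values())
    if total > 1.0 + 1e-9:
        raise ValueError("mRNA proportions of known cell types must sum to <= 1")

    # mRNA not explained by the known cell types is assigned to an
    # uncharacterized cell type.
    other = max(0.0, 1.0 - total)

    # A cell type with more mRNA per cell contributes more mRNA per cell
    # present, so divide by the mRNA content to get back to cell counts.
    scaled = {ct: p / mrna_per_cell[ct] for ct, p in mrna_props.items()}
    scaled["otherCells"] = other / other_mrna_per_cell

    # Rescale so that the cell fractions sum to 1.
    norm = sum(scaled.values())
    return {ct: v / norm for ct, v in scaled.items()}


# Hypothetical inputs: two known cell types explain half of the bulk mRNA.
mrna_props = {"Bcells": 0.2, "Tcells": 0.3}
mrna_per_cell = {"Bcells": 0.5, "Tcells": 0.5}

fractions = mrna_to_cell_fractions(mrna_props, mrna_per_cell)
for cell_type, fraction in fractions.items():
    print(f"{cell_type}: {fraction:.3f}")
```

In this toy case the uncharacterized cells hold half of the mRNA but only about a third of the cells, because they are assumed to carry twice as much mRNA per cell as the two known cell types.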

Keywords: Cell fraction predictions; Computational biology; Gene expression analysis; Immunoinformatics; RNA-seq deconvolution; Tumor immune microenvironment.

MeSH terms

  • Gene Expression Profiling / methods*
  • Genomics / methods
  • Humans
  • Neoplasms / genetics
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis / methods
  • Software*
  • Transcriptome*

Substances

  • RNA, Messenger