signifinder enables the identification of tumor cell states and cancer expression signatures in bulk, single-cell and spatial transcriptomic data

bioRxiv [Preprint]. 2023 Mar 10:2023.03.07.530940. doi: 10.1101/2023.03.07.530940.

Abstract

Over the last decade, many studies and some clinical trials have proposed gene expression signatures as a valuable tool for understanding cancer mechanisms, defining subtypes, and monitoring patient prognosis and therapy efficacy. However, technical and biological concerns about reproducibility have been raised. Technical reproducibility is a major concern: we currently lack a computational implementation of the proposed signatures, which would provide detailed signature definitions and ensure reproducibility, dissemination, and usability of the classifier. Another concern regards intratumor heterogeneity, which has never been addressed when studying these types of biomarkers using bulk transcriptomics. To improve the reproducibility and usability of gene expression signatures, we propose signifinder, an R package that provides the infrastructure to collect, implement, and compare expression-based signatures from the cancer literature. The included signatures cover a wide range of biological processes, from metabolism and programmed cell death to morphological changes, such as quantification of epithelial or mesenchymal-like status. Collected signatures can score tumor cell characteristics, such as the predicted response to therapy or the survival association, and can quantify microenvironmental information, including hypoxia and immune response activity. signifinder has been used to characterize tumor samples and to investigate intratumor heterogeneity, extending its application to single-cell and spatial transcriptomic data. Through these higher-resolution technologies, it has become increasingly apparent that the single-sample score assessment obtained by transcriptional signatures is conditioned by the phenotypic and genetic intratumor heterogeneity of tumor masses. Since the characteristics of the most abundant cell type or clone might not necessarily predict the properties of mixed populations, signature prediction efficacy is lowered, thus impeding effective clinical diagnostics. Through signifinder, we offer general principles for interpreting and comparing transcriptional signatures, as well as suggestions for additional signatures that would allow for more complete and robust data inferences. We consider signifinder a useful tool to pave the way for reproducibility and comparison of transcriptional signatures in oncology.
