Cancer-Alterome: a literature-mined resource for regulatory events caused by genetic alterations in cancer

Sci Data. 2024 Mar 2;11(1):265. doi: 10.1038/s41597-024-03083-9.


Investigating the complex mechanisms underlying tumors is vital for understanding cancer and developing effective treatments. Metabolic abnormalities and clinical phenotypes can serve as essential biomarkers for diagnosing this challenging disease, and genetic alterations provide profound insights into its fundamental aspects. This study introduces Cancer-Alterome, a literature-mined dataset focused on regulatory events in which genetic alterations affect an organism's biological processes or clinical phenotypes. Using a text-mining pipeline that we propose, we identify 16,681K regulatory event records encompassing 21K genes, 157K genetic alterations and 154K downstream bio-concepts, extracted from 4,354K documents of pan-cancer literature. The resulting dataset supports a multifaceted investigation of cancer pathology and allows each record to be traced back to its supporting literature. Its potential applications extend to evidence-based medicine and precision medicine, where it may yield valuable insights for further cancer research.

Publication types

  • Dataset

MeSH terms

  • Data Mining / methods
  • Humans
  • Neoplasms* / genetics
  • Phenotype
  • Precision Medicine* / methods