Proteogenomic data and resources for pan-cancer analysis

Cancer Cell. 2023 Aug 14;41(8):1397-1406. doi: 10.1016/j.ccell.2023.06.009.


The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) investigates tumors from a proteogenomic perspective, creating rich multi-omics datasets connecting genomic aberrations to cancer phenotypes. To facilitate pan-cancer investigations, we have generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to create a cohesive and powerful dataset for scientific discovery. We outline efforts by the CPTAC pan-cancer working group in data harmonization, data dissemination, and computational resources that aid biological discovery. We also discuss challenges for multi-omics data integration and analysis, particularly those that arise from working with both nucleotide sequencing and mass spectrometry proteomics data.

Keywords: CPTAC; data harmonization; multi-omics; open data; pan-cancer; proteogenomics.

Publication types

  • Review
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling
  • Genomics
  • Humans
  • Neoplasms* / genetics
  • Proteogenomics*
  • Proteomics