A comprehensive genomic and transcriptomic dataset of triple-negative breast cancers

Sci Data. 2022 Sep 24;9(1):587. doi: 10.1038/s41597-022-01681-z.

Abstract

Molecular subtyping of triple-negative breast cancer (TNBC) is essential for understanding the mechanisms and discovering actionable targets of this highly heterogeneous type of breast cancer. We previously performed a large single-center multiomics study comprising genomic, transcriptomic, and clinical data from 465 patients with primary TNBC. To facilitate reuse of this unique dataset, we provide here a detailed description of it, with special attention to data quality. The multiomics data were generally of high quality, but a small portion of the sequencing data had quality issues, which should be taken into account in subsequent reuse. Furthermore, we reconducted the data analyses with updated pipelines and an updated version of the human reference genome, moving from hg19 to hg38. The updated profiles were in good concordance with those previously published in terms of gene quantification, variant calling, and copy number alteration. Additionally, we developed a user-friendly web-based database for convenient access to and interactive exploration of the dataset. Our work will facilitate reuse of the dataset, maximize the value of the data, and further accelerate cancer research.

Publication types

  • Dataset

MeSH terms

  • DNA Copy Number Variations
  • Female
  • Genome, Human
  • Genomics
  • Humans
  • Transcriptome*
  • Triple Negative Breast Neoplasms* / genetics