A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer

Sci Adv. 2022 Jan 21;8(3):eabg6711. doi: 10.1126/sciadv.abg6711. Epub 2022 Jan 19.


Tumors display widespread transcriptome alterations, but the full repertoire of isoform-level alternative splicing in cancer is unknown. We developed a long-read (LR) RNA sequencing and analytical platform that identifies and annotates full-length isoforms and infers tumor-specific splicing events. Application of this platform to breast cancer samples identifies thousands of previously unannotated isoforms; ~30% affect protein-coding exons and are predicted to alter protein localization and function. We performed extensive cross-validation with -omics datasets to support transcription and translation of novel isoforms. We identified 3059 breast tumor–specific splicing events, including 35 that are significantly associated with patient survival. Of these, 21 are absent from GENCODE and 10 are enriched in specific breast cancer subtypes. Together, our results demonstrate the complexity, cancer subtype specificity, and clinical relevance of previously unidentified isoforms and splicing events in breast cancer that are only annotatable by LR-seq and provide a rich resource of immuno-oncology therapeutic targets.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Breast Neoplasms* / genetics
  • Female
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Protein Isoforms / genetics
  • Protein Isoforms / metabolism
  • Sequence Analysis, RNA / methods
  • Transcriptome


Substances

  • Protein Isoforms