The Antibody Repertoire of Colorectal Cancer

Mol Cell Proteomics. 2017 Dec;16(12):2111-2124. doi: 10.1074/mcp.RA117.000397. Epub 2017 Oct 18.


Immunotherapy is becoming increasingly important in the fight against cancers, using and manipulating the body's immune response to treat tumors. Understanding the immune repertoire (the collection of immunological proteins) of treated and untreated cells is possible at the genomic level but technically difficult at the protein level. Standard protein databases do not include the highly divergent sequences of somatically rearranged immunoglobulin genes, which may lead to missed identifications in a mass spectrometry search. We introduce a novel proteogenomic approach, AbScan, to identify these highly variable antibody peptides by constructing a customized antibody database from RNA-seq reads aligned to immunoglobulin (Ig) genes. AbScan starts by filtering transcript (RNA-seq) reads that match the template for Ig genes. The retained reads are used to construct a repertoire graph using the "split" de Bruijn graph: a graph structure that improves on the standard de Bruijn graph to capture the high diversity of Ig genes in a compact manner. AbScan corrects for sequencing errors and converts the graph to a format suitable for searching with MS/MS search tools. We used AbScan to create an antibody database from 90 RNA-seq colorectal tumor samples. Next, we used proteogenomic analysis to search MS/MS spectra of matched colorectal samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) against the AbScan-generated database. AbScan identified 1,940 distinct antibody peptides. Correlating these with previously identified single amino-acid variants (SAAVs) in the tumor samples, we identified 163 (antibody peptide, SAAV) pairs with a significant co-occurrence pattern across the 90 samples. The presence of coexpressed antibody and mutated peptides was correlated with the survival time of the individuals. Our results suggest that AbScan is an effective tool for proteomic exploration of the immune response in cancers.
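
To make the graph-construction step more concrete, the following is a minimal sketch of a standard de Bruijn graph built from short reads, with low-coverage edges pruned as a crude stand-in for sequencing-error correction. It is not the "split" de Bruijn graph used by AbScan, whose details are not given in this abstract. The function names, the toy reads, and the `min_count` threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a standard de Bruijn graph (assumed illustration, not AbScan's split variant).

    Nodes are (k-1)-mers; each k-mer in a read adds one edge from its
    prefix to its suffix. Edge values count how often the k-mer was seen.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def prune_low_coverage(graph, min_count):
    """Drop edges seen fewer than min_count times (hypothetical error filter)."""
    pruned = {}
    for src, successors in graph.items():
        kept = {dst: n for dst, n in successors.items() if n >= min_count}
        if kept:
            pruned[src] = kept
    return pruned


# Toy reads: two identical reads plus one read with a single-base difference.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
graph = build_de_bruijn(reads, k=4)
pruned = prune_low_coverage(graph, min_count=2)
```

In this toy case the variant read contributes edges that are seen only once, so they are removed by the threshold. A real antibody repertoire would need a more careful approach, because genuine low-abundance Ig variants can look like errors under a simple coverage cutoff.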

MeSH terms

  • Algorithms
  • Cell Line, Tumor
  • Colorectal Neoplasms / genetics
  • Colorectal Neoplasms / immunology*
  • Databases, Genetic
  • Databases, Protein
  • Genomics / methods*
  • Humans
  • Immunoglobulins / chemistry*
  • Immunoglobulins / genetics
  • Peptides / chemistry
  • Peptides / genetics*
  • Proteomics / methods*
  • Sequence Analysis, RNA
  • Tandem Mass Spectrometry


Substances

  • Immunoglobulins
  • Peptides