CAPER 3.0: A Scalable Cloud-Based System for Data-Intensive Analysis of Chromosome-Centric Human Proteome Project Data Sets

J Proteome Res. 2015 Sep 4;14(9):3720-8. doi: 10.1021/pr501335w. Epub 2015 Mar 27.


The Chromosome-centric Human Proteome Project (C-HPP) aims to catalog genome-encoded proteins using a chromosome-by-chromosome strategy. As the C-HPP proceeds, the increasing requirement for data-intensive analysis of the MS/MS data poses a challenge to the proteomic community, especially small laboratories lacking computational infrastructure. To address this challenge, we have updated the previous CAPER browser into a higher version, CAPER 3.0, which is a scalable cloud-based system for data-intensive analysis of C-HPP data sets. CAPER 3.0 uses cloud computing technology to facilitate MS/MS-based peptide identification. In particular, it can use both public and private cloud, facilitating the analysis of C-HPP data sets. CAPER 3.0 provides a graphical user interface (GUI) to help users transfer data, configure jobs, track progress, and visualize the results comprehensively. These features enable users without programming expertise to easily conduct data-intensive analysis using CAPER 3.0. Here, we illustrate the usage of CAPER 3.0 with four specific mass spectral data-intensive problems: detecting novel peptides, identifying single amino acid variants (SAVs) derived from known missense mutations, identifying sample-specific SAVs, and identifying exon-skipping events. CAPER 3.0 is available at

Keywords: Chromosome-centric Human Proteome Project; Proteomic data analysis platform; big data; bioinformatics; cloud computing; proteomic data visualization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Chromosome Mapping*
  • Cloud Computing*
  • Databases, Protein*
  • Humans
  • Molecular Sequence Data
  • Polymorphism, Genetic
  • Proteins / chemistry
  • Proteins / genetics*
  • Proteome*


  • Proteins
  • Proteome