Identification and characterization of proteins encoded by chromosome 12 as part of chromosome-centric human proteome project

J Proteome Res. 2014 Jul 3;13(7):3166-77. doi: 10.1021/pr401123v. Epub 2014 Jun 24.


The Chromosome-centric Human Proteome Project (C-HPP) is a global initiative to comprehensively characterize the proteins encoded by genes on all human chromosomes, with each team focusing on an individual chromosome. Here, we report mass spectrometry-based identification and characterization of proteins encoded by genes on chromosome 12. Our study is based on proteomic profiling of 30 different histologically normal human tissues and cell types using high-resolution mass spectrometry. In our analysis, we identified 1,535 proteins encoded by 836 genes on human chromosome 12. This includes 89 genes whose products are designated as "missing proteins" by neXtProt because they had no prior evidence from either mass spectrometry or antibody-based detection methods. We identified several variant peptides that reflect coding SNPs annotated in the dbSNP database. We confirmed the start sites of ∼200 proteins by identifying N-terminally acetylated peptides, and we identified alternative start sites for 11 proteins that had not previously been annotated in public databases. Most importantly, we identified 12 novel protein-coding regions on chromosome 12 using our proteogenomics strategy, all of which are annotated as pseudogenes in public databases. This study demonstrates that there is scope for significantly improving the annotation of protein-coding genes in the human genome using mass spectrometry-derived data. Individual efforts within the C-HPP initiative should contribute significantly toward enriching human protein annotation. The data have been deposited to ProteomeXchange with identifier PXD000561.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Amino Acid Sequence
  • Base Sequence
  • Chromosome Mapping
  • Chromosomes, Human, Pair 12 / genetics*
  • Female
  • Humans
  • Male
  • Molecular Sequence Annotation
  • Open Reading Frames
  • Polymorphism, Single Nucleotide
  • Proteome / genetics*
  • Proteome / physiology
  • RNA, Untranslated / genetics
  • Tandem Mass Spectrometry

Substances

  • Proteome
  • RNA, Untranslated