Systematic analysis of missing proteins provides clues to help define all of the protein-coding genes on human chromosome 1

J Proteome Res. 2014 Jan 3;13(1):114-25. doi: 10.1021/pr400900j. Epub 2013 Nov 25.


Our first proteomic exploration of human chromosome 1 began in 2012 (CCPD 1.0), and the genome-wide characterization of the human proteome through public resources revealed that 32-39% of proteins on chromosome 1 remained unidentified. To characterize all of the missing proteins, we applied an OMICS-integrated analysis of three human liver cell lines (Hep3B, MHCC97H, and HCCLM3) using deep sequencing of mRNA and ribosome nascent-chain complex-bound mRNA (RNC-mRNA) together with proteome profiling, contributing mass spectrometric evidence of 60 additional chromosome 1 gene products. Integration of the annotation information from public databases revealed that 84.6% of genes on chromosome 1 had high-confidence protein evidence. Hierarchical analysis demonstrated that the remaining 320 missing genes were either experimentally or biologically explainable: 128 genes were found to be tissue-specific or rarely expressed in some tissues, 91 proteins were uncharacterized mainly due to database annotation diversity, 89 were genes with low mRNA abundance or unsuitable protein properties, and 12 genes were theoretically identifiable because of a high abundance of mRNAs/RNC-mRNAs and the existence of proteotypic peptides. The relatively large contribution made by the identification of enriched transcription factors suggested that specific enrichment of low-abundance protein classes and targeted SRM/MRM (selected/multiple reaction monitoring) assays could capture high-priority missing proteins. Detailed analyses of the differentially expressed genes indicated that several gene families located on chromosome 1 may play critical roles in mediating hepatocellular carcinoma invasion and metastasis. All mass spectrometry proteomics data corresponding to our study were deposited in ProteomeXchange under the identifiers PXD000529, PXD000533, and PXD000535.
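The breakdown of the 320 missing genes can be checked with a short tally. The sketch below uses only the counts stated in the abstract; the variable and function names are illustrative, and the implied chromosome 1 gene total rests on the assumption that the 320 missing genes are exactly the complement of the 84.6% with high-confidence protein evidence, which the abstract does not state explicitly.

```python
# Category counts for the missing genes, as reported in the abstract.
missing_gene_categories = {
    "tissue-specific or rarely expressed": 128,
    "database annotation diversity": 91,
    "low mRNA abundance or unsuitable protein properties": 89,
    "theoretically identifiable": 12,
}


def category_shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The four categories should account for all 320 missing genes.
total_missing = sum(missing_gene_categories.values())
shares = category_shares(missing_gene_categories)

# ASSUMPTION (not stated in the abstract): the 320 missing genes are the
# remaining 15.4% of chromosome 1 genes, so the gene total can be estimated.
covered_fraction = 0.846
implied_total_genes = round(total_missing / (1 - covered_fraction))

if __name__ == "__main__":
    print(f"total missing genes: {total_missing}")
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
    print(f"implied chromosome 1 gene total (estimate): {implied_total_genes}")
```

Under that assumption the estimate lands near 2,000 genes; it is a consistency check on the reported figures, not a number taken from the paper.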

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line, Tumor
  • Chromosomes, Human, Pair 1*
  • Humans
  • Proteins / genetics*
  • Proteomics
Substances

  • Proteins