Use of ENCODE resources to characterize novel proteoforms and missing proteins in the human proteome

J Proteome Res. 2015 Feb 6;14(2):603-8. doi: 10.1021/pr500564q. Epub 2014 Nov 26.


We describe the utility of integrated strategies that combine translation of ENCODE data with the major pillars of proteomic technology to improve the identification of "missing proteins", novel proteoforms, and post-translational modifications (PTMs). On the one hand, databases combined with bioinformatic tools are used to establish microarray-based transcript analysis and to supply rapid protein identifications in clinical samples. On the other hand, sequence libraries are the foundation of targeted protein identification and quantification using mass spectrometric and immunoaffinity techniques. Results from combining proteoENCODEdb searches with experimental mass spectral data indicate that some alternatively spliced forms detected at the transcript level are in fact translated into proteins. Our results provide a step toward meeting the directives of the C-HPP initiative and supporting related biomedical research.

Keywords: Chromosome-centric Human Protein Project; ENCODE; glioma stem cell; microassays; missing proteins; protein sequence mass spectrometry.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Protein Isoforms / chemistry
  • Proteome / chemistry*

Substances

  • Protein Isoforms
  • Proteome