What does it mean to identify a protein in proteomics?

Trends Biochem Sci. 2002 Feb;27(2):74-8. doi: 10.1016/s0968-0004(01)02021-7.


The annotation of the human genome indicates the surprisingly low number of approximately 40,000 genes. However, the estimated number of proteins encoded by these genes is two to three orders of magnitude higher. The ability to unambiguously identify the proteins is a prerequisite for their functional investigation. As proteins derived from the same gene can be largely identical, and might differ only in small but functionally relevant details, protein identification tools must not only identify a large number of proteins but also be able to differentiate between close relatives. This information can be generated by mass spectrometry, an approach that identifies proteins by partial analysis of their digestion-derived peptides. Information gleaned from databases fills in the missing sequence information. Because both sequence databases and experimental data are limited, a certain ambiguity often remains concerning which sequence variant(s) and modification(s) are present. As the common denominator of all the isoforms is a gene, in our opinion, it would be more accurate to state that a product of this particular gene rather than a certain protein has been identified by mass spectrometry.
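
The ambiguity described above can be illustrated with a minimal sketch. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P) and two invented isoform sequences that differ at a single residue. The sequences, the isoform names, and the helper functions `tryptic_digest` and `consistent_isoforms` are hypothetical and are not taken from the article. The sketch shows only why partial peptide coverage may support a gene-level rather than an isoform-level identification.

```python
import re


def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin rule:
    cleave after K or R, except when the next residue is P."""
    # Zero-width split points require Python 3.7 or later.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def consistent_isoforms(observed_peptides, isoforms):
    """Return the names of all isoforms whose digest contains every observed peptide."""
    observed = set(observed_peptides)
    return sorted(
        name
        for name, sequence in isoforms.items()
        if observed <= set(tryptic_digest(sequence))
    )


# Hypothetical products of one gene, differing at a single residue (S vs. T).
isoforms = {
    "isoform_a": "MKTAYIAKQRGLSFVK",
    "isoform_b": "MKTAYIAKQRGLTFVK",
}

# Only shared peptides were observed: both isoforms remain possible.
shared_only = consistent_isoforms({"TAYIAK", "QR"}, isoforms)

# A peptide covering the variable site was observed: one isoform remains.
with_variant = consistent_isoforms({"TAYIAK", "GLSFVK"}, isoforms)

print(shared_only)
print(with_variant)
```

In this toy case the first observation set is compatible with both sequences, so only the common gene can be named. The second set includes the peptide spanning the variable residue and is compatible with one sequence only.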

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Gene Expression Profiling
  • Mass Spectrometry / methods
  • Peptide Mapping
  • Proteins / chemistry*
  • Proteome / chemistry*

Substances

  • Proteins
  • Proteome