Comparative bioinformatic analysis of complete proteomes and protein parameters for cross-species identification in proteomics

Proteomics. 2002 Oct;2(10):1392-405. doi: 10.1002/1615-9861(200210)2:10<1392::AID-PROT1392>3.0.CO;2-L.


Peptide mass fingerprinting (PMF) remains the most amenable technique for protein identification in proteomics, using mass spectrometry as the primary analytical technique coupled with bioinformatics. The approach relies on the presence of the amino acid sequence of the protein in the current databanks. Nevertheless, it is desirable to be able to use the technique for organisms whose genomes are not yet fully sequenced by applying cross-species protein identification. In this study, we have re-examined the feasibility of such approaches by considering the extent of protein similarity between genome sequences, using a data set of 29 complete bacterial and two eukaryotic genomes. A range of protein and peptide features is considered, including protein isoelectric point, protein mass, and amino acid conservation. The effectiveness of PMF approaches has then been tested in a series of computer simulations with varying peptide number and mass accuracy for several cross-species tests. The results show that PMF alone is generally unsuitable for jumps between divergent species, or when protein sequence identity is below 70%. Even so, tryptic peptide conservation is considerably enriched above random, and PMF promises to remain useful for cross-species protein identification when combined with data beyond peptide masses.
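The kind of cross-species PMF test described above can be illustrated with a minimal sketch. This is not the authors' simulation code. It assumes an idealized trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, no modifications, and standard monoisotopic residue masses. The function names (`tryptic_digest`, `peptide_mass`, `count_shared_masses`) and the `min_len` and `ppm` parameters are hypothetical names chosen for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added per free peptide


def tryptic_digest(seq, min_len=6):
    """Idealized trypsin: cleave after K/R unless followed by P (assumption)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R
    # Very short peptides are usually uninformative, so drop them.
    return [p for p in peptides if len(p) >= min_len]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_shared_masses(query_seq, target_seq, ppm=50.0, min_len=6):
    """Count query peptide masses matched by any target peptide mass
    within the given relative tolerance (parts per million)."""
    query = [peptide_mass(p) for p in tryptic_digest(query_seq, min_len)]
    target = [peptide_mass(p) for p in tryptic_digest(target_seq, min_len)]
    shared = 0
    for q in query:
        tol = q * ppm / 1e6
        if any(abs(q - t) <= tol for t in target):
            shared += 1
    return shared


if __name__ == "__main__":
    # Two toy "orthologues" differing by one substitution (G -> S).
    species_a = "AAAAAAKGGGGGGR"
    species_b = "AAAAAAKGGGGSGR"
    print(count_shared_masses(species_a, species_b))
```

In this toy case a single substitution removes one of the two tryptic peptides from the set of matching masses. This illustrates why the number of conserved peptide masses can fall quickly as sequence identity drops. Varying `ppm` and the number of peptides considered corresponds loosely to the mass accuracy and peptide number parameters mentioned in the abstract.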

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Computational Biology* / methods*
  • Computational Biology* / trends*
  • Databases as Topic
  • Escherichia coli / genetics
  • Genome, Bacterial
  • Genome, Fungal
  • Hydrogen-Ion Concentration
  • Mass Spectrometry
  • Proteome*
  • Saccharomyces cerevisiae / metabolism
  • Schizosaccharomyces / metabolism
  • Species Specificity


Substances

  • Amino Acids
  • Proteome