Background: Developments in 'soft' ionisation techniques have revolutionized mass-spectrometric approaches for the analysis of protein structure. For more than a decade, such techniques have been used, in conjunction with digestion by specific proteases, to produce accurate peptide molecular weight 'fingerprints' of proteins. These fingerprints have commonly been used to screen known proteins, in order to detect errors of translation, to characterize post-translational modifications and to assign disulphide bonds. However, the extent to which peptide-mass information can be used alone to identify unknown sample proteins, independent of other analytical methods such as protein sequence analysis, has remained largely unexplored.
Results: We report here on the development of the molecular weight search (MOWSE) peptide-mass database at the SERC Daresbury Laboratory. Practical experience has shown that sample proteins can be uniquely identified from as few as three or four experimentally determined peptide masses when these are screened against a fragment database that is derived from over 50 000 proteins. Experimental errors of a few Daltons are tolerated by the scoring algorithms, thus permitting the use of inexpensive time-of-flight mass spectrometers. As with other types of physical data, such as amino-acid composition or linear sequence, peptide masses provide a set of determinants that are sufficiently discriminating to identify or match unknown sample proteins.
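The screening step described above can be illustrated with a minimal sketch. It is not the MOWSE scoring algorithm, which this abstract does not specify. It assumes trypsin as the protease (cleavage after K or R, except before P), approximate average residue masses, a tolerance of 2 Da standing in for "a few Daltons", and a simple count of matched masses as the score. All function names and the two-entry toy database are hypothetical.

```python
# Approximate average residue masses in Daltons (assumed standard values).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added per free peptide


def tryptic_peptides(seq):
    """Split a sequence after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Average mass of a peptide: sum of residue masses plus one water."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, seq, tol=2.0):
    """Number of observed masses within tol Da of any predicted fragment."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(m - o) <= tol for m in predicted) for o in observed)


def rank_proteins(observed, database, tol=2.0):
    """Rank database entries by matched-mass count, best first."""
    scores = {name: count_matches(observed, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy database; a real fragment database holds tens of
# thousands of proteins.
TOY_DATABASE = {
    "toy_a": "GASKPVTRLLEK",
    "toy_b": "MFWYHKDDDR",
}
```

Calling `rank_proteins(observed_masses, TOY_DATABASE)` returns the entries ordered by how many observed masses they explain. A raw count is the simplest possible score; it ignores how common a given fragment mass is across the database, which a practical scoring scheme would need to weigh.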
Conclusion: Peptide-mass fingerprints can prove as discriminating as linear peptide sequences, but can be obtained in a fraction of the time using less protein. In many cases, this allows for a rapid identification of a sample protein before committing it to protein sequence analysis. Fragment masses also provide information, at the protein level, that is complementary to the information provided by large-scale DNA sequencing or mapping projects.