pNovo: de novo peptide sequencing and identification using HCD spectra

J Proteome Res. 2010 May 7;9(5):2713-24. doi: 10.1021/pr100182k.

Abstract

De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only approximately 30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), far fewer than database search can. This is mainly due to incomplete fragmentation and the overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) greatly benefits de novo sequencing because it produces high-mass-accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. In addition, the abundant internal and immonium ions in HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo correctly identified 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently at no extra computational cost. In summary, exploiting these HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.
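To make the role of mass accuracy and low-mass ions concrete, the following is a minimal Python sketch of the textbook fragment-ion arithmetic that de novo sequencing rests on. It is not the pNovo algorithm, which the abstract does not describe in detail. The function names (`b_ions`, `y_ions`, `immonium_ion`, `residues_for_gap`) and the tolerance values are hypothetical choices for illustration, and the monoisotopic residue masses are standard reference values rather than data from the paper.

```python
# Standard monoisotopic residue masses in Da (reference values, not from the paper).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # mass of H2O
CO = 27.994915      # mass of carbon monoxide


def b_ions(peptide):
    """Singly charged b-ion m/z values b1 .. b(n-1), read from the N-terminus."""
    total, out = 0.0, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        out.append(total + PROTON)
    return out


def y_ions(peptide):
    """Singly charged y-ion m/z values y1 .. y(n-1), read from the C-terminus."""
    total, out = 0.0, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        out.append(total + WATER + PROTON)
    return out


def immonium_ion(aa):
    """m/z of the singly charged immonium ion of one residue (residue - CO + H+)."""
    return RESIDUE_MASS[aa] - CO + PROTON


def residues_for_gap(gap, tol_da):
    """Residues whose mass matches the gap between two adjacent peaks
    of the same ion series, within an absolute tolerance in Da."""
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tol_da)


if __name__ == "__main__":
    print("b ions of PEPTIDE:", [round(mz, 4) for mz in b_ions("PEPTIDE")])
    print("y ions of PEPTIDE:", [round(mz, 4) for mz in y_ions("PEPTIDE")])
    print("immonium ion of H:", round(immonium_ion("H"), 4))
    # A loose tolerance cannot separate K from Q; a tight one can.
    print("gap 128.0950, tol 0.5 Da: ", residues_for_gap(128.0950, 0.5))
    print("gap 128.0950, tol 0.01 Da:", residues_for_gap(128.0950, 0.01))
    # Deamidation turns N into D, so the shifted residue is simply read as D.
    print("D - N mass shift:", round(RESIDUE_MASS["D"] - RESIDUE_MASS["N"], 5))
```

Two points in the sketch mirror claims in the abstract. Lysine and glutamine differ by only about 0.036 Da, so a gap between adjacent peaks can be assigned unambiguously only when the mass tolerance is tight, which is where high-mass-accuracy spectra help. Immonium ions fall at low m/z (about 110.07 for histidine), a region that can be lost to the low-mass cutoff of ion trap CID. Leucine and isoleucine have identical masses, so this kind of mass-gap lookup cannot tell them apart.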

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Cattle
  • Chickens
  • Data Mining
  • Databases, Protein
  • Escherichia coli Proteins
  • Glycine max
  • Molecular Sequence Data
  • Peptide Fragments / chemistry*
  • Proteins / chemistry
  • Rabbits
  • Sequence Analysis, Protein / methods*
  • Software
  • Tandem Mass Spectrometry / methods*

Substances

  • Escherichia coli Proteins
  • Peptide Fragments
  • Proteins