De novo peptide identification via tandem mass spectrometry and integer linear optimization

Anal Chem. 2007 Feb 15;79(4):1433-46. doi: 10.1021/ac0618425.


This article presents a novel methodology for the automated de novo identification of peptides from tandem mass spectra using integer linear optimization (also called integer linear programming, or ILP). The features of the mathematical model are described, and examples illustrate the key concepts of the approach. A variety of challenging peptide identification problems, accompanied by a comparative study with five state-of-the-art methods, are examined to illustrate the method's ability to address (a) residue-dependent fragmentation properties that result in missing ion peaks and (b) the variability of resolution across mass analyzers. A preprocessing algorithm identifies important m/z values in the tandem mass spectrum. Missing peaks, which arise from residue-dependent fragmentation characteristics, are handled by a two-stage algorithmic framework. A cross-correlation approach resolves missing amino acid assignments and selects the most probable peptide by comparing the theoretical spectra of the candidate sequences generated in the ILP sequencing stages with the experimental tandem mass spectrum. The proposed de novo method, denoted PILOT, is compared with the existing popular methods Lutefisk, PEAKS, PepNovo, EigenMS, and NovoHMM on a set of spectra from QTOF and ion trap instruments.
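The final step described in the abstract, comparing theoretical spectra of candidate sequences against the experimental spectrum, can be illustrated with a minimal sketch. This is not PILOT's implementation: a simple shared-peak intensity score stands in for the cross-correlation the abstract mentions, only singly charged b- and y-ions are generated, and the function names (`theoretical_ions`, `match_score`, `rank_candidates`) and the 0.5 Da tolerance are assumptions made for illustration. The residue masses are standard monoisotopic values for a small subset of amino acids.

```python
# Monoisotopic residue masses in Da (subset of the 20 standard amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
PROTON = 1.00728   # mass of a proton, Da
WATER = 18.01056   # mass of H2O, Da


def theoretical_ions(peptide):
    """Return singly charged b- and y-ion m/z values for a peptide string."""
    masses = [RESIDUE_MASS[r] for r in peptide]
    ions = []
    for i in range(1, len(masses)):
        b_ion = sum(masses[:i]) + PROTON          # N-terminal fragment
        y_ion = sum(masses[i:]) + WATER + PROTON  # C-terminal fragment
        ions.extend([b_ion, y_ion])
    return ions


def match_score(peptide, peaks, tol=0.5):
    """Sum intensities of experimental peaks lying within tol Da of any
    theoretical ion. `peaks` is a list of (m/z, intensity) pairs.
    Hypothetical stand-in for a cross-correlation score."""
    ions = theoretical_ions(peptide)
    return sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - ion) <= tol for ion in ions)
    )


def rank_candidates(candidates, peaks, tol=0.5):
    """Order candidate sequences from best to worst shared-peak score."""
    return sorted(candidates, key=lambda p: match_score(p, peaks, tol),
                  reverse=True)


if __name__ == "__main__":
    # Synthetic spectrum built from the fragments of GASK, unit intensities.
    spectrum = [(mz, 1.0) for mz in theoretical_ions("GASK")]
    print(rank_candidates(["AGSK", "GSAK", "GASK"], spectrum))
```

In this toy setting the candidates would come from the ILP sequencing stages. A wider tolerance would loosely mimic a lower-resolution analyzer such as an ion trap, and a narrower one a QTOF instrument.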

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases as Topic
  • Peptides / chemistry*
  • Programming, Linear*
  • Sensitivity and Specificity
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptides