Background: The discovery of novel protein biomarkers is essential in the clinical setting to enable early disease diagnosis and improve survival rates. To facilitate differential expression analysis and biomarker discovery, a variety of tandem mass spectrometry (MS/MS)-based protein profiling techniques have been developed. To achieve sensitive detection and accurate quantitation, targeted MS screening approaches, such as multiple reaction monitoring (MRM), have been implemented.
Methods: Protein extracts from MCF-7 breast cancer cells were analyzed by two-dimensional strong cation exchange (SCX)/reversed-phase liquid chromatography (RPLC) separations interfaced to linear ion trap MS detection. MS data were interpreted with the Sequest-based Bioworks software (Thermo Electron). Perl scripts developed in-house were used to calculate the spectral counts and the representative fragment ions for each peptide.
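The spectral counting and fragment ion selection steps can be illustrated with a minimal sketch. It is written in Python for illustration and is not the in-house Perl code; the function names and the input layout (one tuple per identified MS/MS spectrum, and peak lists as (m/z, intensity) pairs) are assumptions rather than details given in this work.

```python
from collections import Counter


def spectral_counts(psms):
    """Count identified MS/MS spectra per peptide and per protein.

    psms: iterable of (protein_id, peptide_sequence, charge) tuples,
    one tuple per identified spectrum (hypothetical input layout).
    """
    peptide_counts = Counter()
    protein_counts = Counter()
    for protein, peptide, charge in psms:
        peptide_counts[(protein, peptide, charge)] += 1
        protein_counts[protein] += 1
    return peptide_counts, protein_counts


def top_fragment_ions(peaks, n=10):
    """Return the n most intense (m/z, intensity) peaks, most intense first."""
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)[:n]


if __name__ == "__main__":
    # Toy data, not taken from the library described here.
    psms = [("P1", "AAK", 2), ("P1", "AAK", 2), ("P1", "CCR", 2), ("P2", "DDK", 3)]
    peptide_counts, protein_counts = spectral_counts(psms)
    # Keep only proteins supported by at least two spectral counts.
    kept = sorted(p for p, c in protein_counts.items() if c >= 2)
    print(kept)
```

Counting per (protein, peptide, charge) keeps the charge states of a peptide separate, which matters later because each charge state has its own precursor m/z.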
Results: In this work, we report on the generation of a library of 9,677 peptides (p < 0.001), representing approximately 1,572 proteins from human breast cancer cells, that can be used for MRM/MS-based biomarker screening studies. For each protein, the library provides the number and sequence of detectable peptides, the charge state, the spectral count, the molecular weight, the parameters that characterize the quality of the tandem mass spectrum (p-value, DeltaM, Xcorr, DeltaCn, Sp, and the number of matching a, b, and y ions in the spectrum), the retention time, and the top 10 most intense product ions that correspond to a given peptide. Only proteins identified by at least two spectral counts are listed. The experimental distribution of protein frequencies, as a function of molecular weight, closely matched the theoretical distribution of proteins in the human proteome, as provided in the SwissProt database. The amino acid sequence coverage of the identified proteins ranged from 0.04% to 98.3%. The highest-abundance proteins in the cellular extract had a molecular weight (MW) < 50,000.
Conclusion: Preliminary experiments have demonstrated that putative biomarkers that are not detectable by conventional data-dependent MS acquisition methods in complex unfractionated samples can be reliably identified with the information provided in this library. Based on the spectral count, the quality of a tandem mass spectrum, and the m/z values for a parent peptide and its most abundant daughter ions, MRM conditions can be selected to enable the detection of target peptides and proteins.
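As an illustration of how MRM conditions might be derived from a library entry, the sketch below pairs the precursor m/z with the most intense product ions after a simple quality filter. The record layout, the function names, and the threshold values are hypothetical; the sketch also assumes that the stored peptide mass is a neutral monoisotopic mass, which may differ from the convention used in the library.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def precursor_mz(neutral_mass, charge):
    """m/z of a peptide ion carrying `charge` protons (assumes a neutral mass)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def select_transitions(entry, n_transitions=3, min_spectral_count=2, min_xcorr=2.0):
    """Return up to n_transitions (Q1 m/z, Q3 m/z) pairs for one library entry.

    entry: dict with keys "mass", "charge", "spectral_count", "xcorr" and
    "product_ions" (list of (m/z, intensity) pairs). The layout and the
    default thresholds are illustrative assumptions, not values from the text.
    """
    if entry["spectral_count"] < min_spectral_count or entry["xcorr"] < min_xcorr:
        return []  # entry does not pass the illustrative quality filter
    q1 = precursor_mz(entry["mass"], entry["charge"])
    # Rank product ions by intensity and keep the strongest ones as Q3 targets.
    ranked = sorted(entry["product_ions"], key=lambda ion: ion[1], reverse=True)
    return [(q1, mz) for mz, _intensity in ranked[:n_transitions]]


if __name__ == "__main__":
    # Toy entry, not taken from the library described here.
    entry = {
        "mass": 1000.0,
        "charge": 2,
        "spectral_count": 5,
        "xcorr": 3.1,
        "product_ions": [(300.2, 10.0), (450.3, 80.0), (600.4, 40.0), (720.5, 5.0)],
    }
    for q1, q3 in select_transitions(entry, n_transitions=2):
        print(f"Q1 {q1:.4f} -> Q3 {q3:.4f}")
```

In practice, the retention time stored with each peptide could also be used to schedule the transitions, and other quality parameters (p-value, DeltaCn) could replace or supplement the Xcorr filter.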