Phosphorylation-driven cell signaling governs most biological functions and is widely studied using mass-spectrometry-based phosphoproteomics. Identifying peptides from the raw data and localizing the phosphorylation sites within them is challenging; several algorithms perform these tasks, but they return scores that are not directly comparable. This increases the heterogeneity among published phosphoproteomics data sets and prevents their direct integration. Here we compare 22 pipelines implemented in the main software tools used for bottom-up phosphoproteomics analysis (MaxQuant, Proteome Discoverer, PeptideShaker). We test six search engines (Andromeda, Comet, Mascot, MS Amanda, SequestHT, and X!Tandem) in combination with several localization scoring algorithms (delta score, D-score, PTM-score, phosphoRS, and Ascore). We show that the resulting scores follow very different distributions, which can lead to different false localization rates for the same threshold. We provide a strategy to discriminate correctly localized from incorrectly localized phosphorylation sites in a consistent manner across the tested pipelines. The results presented here can help users choose the most appropriate pipeline and cutoffs for their phosphoproteomics analysis.
Keywords: DDA; MS; MaxQuant; PTM; PeptideShaker; Proteome Discoverer; data-dependent acquisition; mass spectrometry; phosphoproteomics; phosphorylation.