Selection of Features with Consistent Profiles Improves Relative Protein Quantification in Mass Spectrometry Experiments

Mol Cell Proteomics. 2020 Jun;19(6):944-959. doi: 10.1074/mcp.RA119.001792. Epub 2020 Mar 31.

Abstract

In bottom-up mass spectrometry-based proteomics, relative protein quantification is often achieved with data-dependent acquisition (DDA), data-independent acquisition (DIA), or selected reaction monitoring (SRM). These workflows quantify proteins by summarizing the abundances of all the spectral features of the protein (e.g. precursor ions, transitions or fragments) in a single value per protein per run. When abundances of some features are inconsistent with the overall protein profile (for technological reasons such as interferences, or for biological reasons such as post-translational modifications), the protein-level summaries and the downstream conclusions are undermined. We propose a statistical approach that automatically detects spectral features with such inconsistent patterns. The detected features can be separately investigated, and if necessary, removed from the data set. We evaluated the proposed approach on a series of benchmark-controlled mixtures and biological investigations with DDA, DIA and SRM data acquisitions. The results demonstrated that it could facilitate and complement manual curation of the data. Moreover, it can improve the estimation accuracy, sensitivity and specificity of detecting differentially abundant proteins, and reproducibility of conclusions across different data processing tools. The approach is implemented as an option in the open-source R-based software MSstats.
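The idea of flagging features whose profiles disagree with the overall protein profile can be illustrated with a small sketch. It is not the algorithm implemented in MSstats. It assumes log2 intensities with no missing values, uses a median-based run summary, and flags a feature when its residual variance is much larger than is typical for the protein. The function names, the `ratio` and `floor` parameters, and the example intensities are all hypothetical.

```python
from statistics import median, pvariance


def center(values):
    """Subtract a feature's own median so features share a common scale."""
    m = median(values)
    return [v - m for v in values]


def protein_profile(features):
    """Per-run median of the centered log2 intensities of all features."""
    centered = [center(v) for v in features.values()]
    n_runs = len(centered[0])
    return [median(c[r] for c in centered) for r in range(n_runs)]


def residual_variances(features):
    """Variance of each feature's deviation from the protein profile."""
    profile = protein_profile(features)
    return {
        name: pvariance([c - p for c, p in zip(center(values), profile)])
        for name, values in features.items()
    }


def flag_inconsistent(features, ratio=5.0, floor=0.01):
    """Flag features whose residual variance exceeds `ratio` times the
    typical (median) residual variance. `ratio` and `floor` are arbitrary
    illustrative choices, not values from the paper."""
    variances = residual_variances(features)
    typical = max(median(variances.values()), floor)
    return sorted(n for n, v in variances.items() if v > ratio * typical)


# Hypothetical log2 intensities: four features of one protein, four runs.
example = {
    "pep1": [20.0, 20.1, 21.0, 21.1],
    "pep2": [18.0, 18.1, 19.0, 19.1],
    "pep3": [22.1, 22.0, 23.1, 23.0],
    "pep4": [19.0, 21.5, 19.2, 21.4],  # pattern disagrees with the others
}

flagged = flag_inconsistent(example)
kept = {k: v for k, v in example.items() if k not in flagged}
print(flagged)
print(protein_profile(kept))
```

In this toy input, only `pep4` is flagged, and the protein profile is then recomputed from the remaining features. In line with the abstract, flagged features would be investigated separately and removed only if necessary.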

Keywords: Statistics; bioinformatics; biostatistics; computational biology; label-free quantification; mass spectrometry; multiple reaction monitoring; quantification; selected reaction monitoring; targeted mass spectrometry.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Protein
  • Mass Spectrometry / methods*
  • Protein Processing, Post-Translational
  • Proteins / analysis*
  • Proteomics / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software

Substances

  • Proteins