A Tutorial for Variance-Sensitive Clustering and the Quantitative Analysis of Protein Complexes

Methods Mol Biol. 2021:2228:433-451. doi: 10.1007/978-1-0716-1024-4_30.

Abstract

Data clustering facilitates the identification of biologically relevant molecular features in quantitative proteomics experiments with thousands of measurements over multiple conditions. It finds groups of proteins or peptides with similar quantitative behavior across these conditions. Such co-regulatory behavior suggests that the proteins in a group are functionally related and can therefore often be mapped to the same biological processes and molecular subnetworks. While conventional clustering approaches ignore the variance of the measured proteins, VSClust combines statistical testing with pattern recognition in a common algorithm. Here, we show how to use the VSClust web service on a large proteomics data set and present further tools to assess the quantitative behavior of protein complexes.

Keywords: Bioinformatics; Biological pathways; Cluster analysis; Differential analysis; Multivariate analysis; Pattern recognition; Protein complexes; Proteomics.

MeSH terms

  • Breast Neoplasms / metabolism*
  • Cluster Analysis
  • Data Interpretation, Statistical
  • Databases, Protein
  • Female
  • Humans
  • Multiprotein Complexes
  • Neoplasm Proteins / analysis*
  • Protein Binding
  • Proteome*
  • Proteomics* / statistics & numerical data
  • Research Design
  • Software

Substances

  • Multiprotein Complexes
  • Neoplasm Proteins
  • Proteome