A Bayesian algorithm for detecting differentially expressed proteins and its application in breast cancer research

Sci Rep. 2016 Jul 22:6:30159. doi: 10.1038/srep30159.

Abstract

The presence of considerable noise and missing data points makes the analysis of mass-spectrometry (MS) based proteomic data a challenging task. The missing values in MS data are caused by the inability of MS machines to reliably detect proteins whose abundances fall below the detection limit. We developed a Bayesian algorithm that exploits this knowledge and uses missing data points as a source of information complementary to the observed protein intensities in order to find differentially expressed proteins in MS based proteomic data. We compared its accuracy with that of many other methods on several simulated datasets, and it consistently outperformed them. We then used it to analyse proteomic screens of a breast cancer (BC) patient cohort. It revealed large differences between the proteomic landscapes of triple negative and Luminal A BC, which are respectively the most and least aggressive types of BC. Unexpectedly, the majority of these differences could be attributed to the direct transcriptional activity of only seven transcription factors, some of which are known to be inactive in triple negative BC. We also identified two new proteins that correlated significantly with the survival of BC patients and may therefore have potential diagnostic or prognostic value.
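
The idea of treating missing values as evidence of low abundance can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give; it is a generic left-censored normal likelihood, in which an observed intensity contributes its density and a missing value contributes the probability of falling below the detection limit. The function names, the known standard deviation, the detection limit of 10.0, the grid search, and the example intensities are all assumptions made for illustration.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for a normal distribution."""
    z = (x - mu) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2))
    return math.log(max(p, 1e-300))  # guard against log(0)


def censored_loglik(values, mu, sigma, limit):
    """Log-likelihood in which None means 'below the detection limit'."""
    total = 0.0
    for v in values:
        if v is None:
            # Missing value: all we know is that it fell below the limit.
            total += normal_logcdf(limit, mu, sigma)
        else:
            total += normal_logpdf(v, mu, sigma)
    return total


def grid_mle_mean(values, sigma, limit, lo=0.0, hi=20.0, steps=2001):
    """Crude maximum-likelihood estimate of the mean by grid search."""
    best_mu, best_ll = lo, -math.inf
    for i in range(steps):
        mu = lo + (hi - lo) * i / (steps - 1)
        ll = censored_loglik(values, mu, sigma, limit)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return best_mu


# Hypothetical log-intensities of one protein in two patient groups.
group_a = [12.1, 11.8, 12.4, 12.0, 11.9]    # fully observed
group_b = [10.2, None, 10.5, None, None]    # three missing values

mu_a = grid_mle_mean(group_a, sigma=1.0, limit=10.0)
mu_b = grid_mle_mean(group_b, sigma=1.0, limit=10.0)
print(mu_a, mu_b)
```

In this toy setup the missing values in the second group pull its estimated mean below the average of its two observed intensities, which is the sense in which missingness carries information about abundance. A full Bayesian treatment would additionally place priors on the parameters and compare the groups through their posterior distributions.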

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Female
  • Humans
  • Mass Spectrometry / methods
  • Prognosis
  • Proteome / metabolism*
  • Proteomics / methods
  • Transcription, Genetic / physiology
  • Triple Negative Breast Neoplasms / metabolism*
  • Triple Negative Breast Neoplasms / pathology

Substances

  • Proteome