Top-Down Proteomics of Human Saliva, Analyzed with Logistic Regression and Machine Learning Methods, Reveal Molecular Signatures of Ovarian Cancer

Int J Mol Sci. 2023 Oct 28;24(21):15716. doi: 10.3390/ijms242115716.


Ovarian cancer (OC) is the most lethal of all gynecological cancers. Due to vague symptoms, OC is mostly detected at advanced stages, with a 5-year survival rate (SR) of only 30%; diagnosis at stage I increases the 5-year SR to 90%, suggesting that early diagnosis is essential to cure OC. Currently, the clinical need for an early, reliable diagnostic test for OC screening remains unmet; indeed, screening is not even recommended for healthy women with no familial history of OC for fear of post-screening adverse events. Salivary diagnostics is considered a major resource for diagnostics of the future. In this work, we searched for OC biomarkers (BMs) by comparing saliva samples of patients with various stages of OC, breast cancer (BC) patients, and healthy subjects using an unbiased, high-throughput proteomics approach. We analyzed the results using both logistic regression (LR) and machine learning (ML) for pattern analysis and variable selection to highlight molecular signatures for OC and BC diagnosis and possibly re-classification. Here, we show that saliva is an informative test fluid for an unbiased proteomic search of candidate BMs for identifying OC patients. Although we were not able to fully exploit the potential of ML methods due to the small sample size of our study, LR and ML provided patterns of candidate BMs that are now available for further validation analysis in the relevant population and for biochemical identification.

Keywords: SELDI-TOF-MS; biomarkers; breast cancer; logistic regression; machine learning; mass spectrometry; ovarian cancer; proteomics; saliva.

MeSH terms

  • Biomarkers, Tumor
  • Female
  • Humans
  • Logistic Models
  • Machine Learning
  • Ovarian Neoplasms* / diagnosis
  • Proteomics / methods
  • Saliva*

