Applying bioinformatics to proteomics: is machine learning the answer to biomarker discovery for PD and MSA?

Mov Disord. 2012 Nov;27(13):1595-7. doi: 10.1002/mds.25189. Epub 2012 Oct 31.


Bioinformatics tools are increasingly being applied to proteomic data to facilitate the identification of biomarkers and the classification of patients. In the June 2012 issue, Ishigami et al. used principal component analysis (PCA) to extract features and a support vector machine (SVM) to differentiate and classify cerebrospinal fluid (CSF) samples from two small cohorts of patients diagnosed with either Parkinson's disease (PD) or multiple system atrophy (MSA), based on differences in the patterns of peaks generated with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). PCA accurately segregated patients with PD and MSA from controls when the cohorts were combined, but did not perform well when segregating PD from MSA. The SVM, a machine learning classification model, correctly classified the samples from patients with early PD or MSA, and the peak at m/z 6250 was identified as a strong contributor to the ability of an SVM trained on one cohort to distinguish the proteomic profiles in either cohort. This study, while preliminary, provides promising results for the application of bioinformatics tools to proteomic data, an approach that may eventually help clinicians differentiate and diagnose closely related parkinsonian disorders.
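The general shape of a PCA-plus-SVM workflow like the one summarized above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis performed by Ishigami et al.: the group sizes, the number of peaks, the positions and size of the group difference, the number of principal components, the linear kernel, and the use of scikit-learn are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for a peak-intensity matrix (rows: samples, columns: peaks).
# All sizes and values here are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
n_per_group, n_peaks = 20, 50
shifted_peaks = [5, 12, 23, 31, 44]  # hypothetical peaks that differ between groups

X = rng.normal(0.0, 1.0, size=(2 * n_per_group, n_peaks))
y = np.repeat([0, 1], n_per_group)       # 0 and 1 label the two diagnostic groups
X[n_per_group:, shifted_peaks] += 3.0    # raise the chosen peaks in group 1

# PCA reduces the peak matrix to a few components; a linear SVM classifies them.
model = make_pipeline(PCA(n_components=5), SVC(kernel="linear"))

# Cross-validation refits PCA inside each fold, so held-out samples never
# influence the extracted components.
scores = cross_val_score(model, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())

# Map the SVM weights back to the original peaks to see which ones
# contribute most to the decision boundary.
model.fit(X, y)
pca = model.named_steps["pca"]
svm = model.named_steps["svc"]
peak_weights = (svm.coef_ @ pca.components_).ravel()
top_peaks = np.argsort(np.abs(peak_weights))[::-1][:5]
print("peaks with the largest absolute weight:", top_peaks)
```

Wrapping both steps in one pipeline is a design choice that keeps feature extraction inside the cross-validation loop, which matters for small cohorts, where fitting PCA on all samples first can make accuracy estimates look better than they are.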

Publication types

  • Comment

MeSH terms

  • Cerebrospinal Fluid Proteins / cerebrospinal fluid*
  • Female
  • Humans
  • Male
  • Multiple System Atrophy / cerebrospinal fluid*
  • Multiple System Atrophy / diagnosis*
  • Parkinson Disease / cerebrospinal fluid*
  • Parkinson Disease / diagnosis*
  • Proteomics / methods*


Substances

  • Cerebrospinal Fluid Proteins