Transparent Exploration of Machine Learning for Biomarker Discovery from Proteomics and Omics Data

J Proteome Res. 2023 Feb 3;22(2):359-367. doi: 10.1021/acs.jproteome.2c00473. Epub 2022 Nov 25.

Abstract

Biomarkers are of central importance for assessing health status and for guiding medical interventions and evaluating their efficacy; still, they are lacking for most diseases. Mass spectrometry (MS)-based proteomics is a powerful technology for biomarker discovery but requires sophisticated bioinformatics to identify robust patterns. Machine learning (ML) has become a promising tool for this purpose. However, it is sometimes applied in an opaque manner and generally requires specialized knowledge. To enable easy access to ML for biomarker discovery without any programming or bioinformatics skills, we developed "OmicLearn" (http://OmicLearn.org), an open-source browser-based ML tool using the latest advances in the Python ML ecosystem. Data matrices from omics experiments are easily uploaded to an online or a locally installed web server. OmicLearn enables rapid exploration of the suitability of various ML algorithms for the experimental data sets. It fosters open science via transparent assessment of state-of-the-art algorithms in a standardized format for proteomics and other omics sciences.
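
The kind of assessment described above, fitting a classifier to a samples-by-features data matrix and reporting held-out performance per cross-validation split, can be sketched in a few lines of Python. This is an illustrative sketch only and is not OmicLearn's code: the synthetic matrix, the nearest-centroid classifier, and all function names (make_synthetic_matrix, fit_nearest_centroid, predict, cross_validate) are assumptions chosen so the example runs with the standard library alone.

```python
import random
import statistics


def make_synthetic_matrix(n_per_class=20, n_features=5, shift=2.0, seed=0):
    """Hypothetical stand-in for an uploaded omics matrix (rows = samples)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Only feature 0 differs between classes (a planted "biomarker");
            # the remaining features are pure noise.
            row = [rng.gauss(shift * label if j == 0 else 0.0, 1.0)
                   for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y


def fit_nearest_centroid(X, y):
    """Return one mean feature vector (centroid) per class label."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [statistics.fmean(col) for col in zip(*rows)]
    return model


def predict(model, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))


def cross_validate(X, y, n_splits=5, seed=0):
    """Shuffled k-fold cross-validation; returns accuracy for each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        # The model only ever sees the training part of each split.
        model = fit_nearest_centroid([X[i] for i in train_idx],
                                     [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


X, y = make_synthetic_matrix()
scores = cross_validate(X, y)
print(f"accuracy per fold: {scores}")
print(f"mean accuracy: {statistics.fmean(scores):.2f}")
```

Reporting the score of every fold, rather than a single pooled number, is one simple way to make the variability of an ML result visible; the fixed seeds keep the run reproducible.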

Keywords: diagnostics; machine learning; mass spectrometry; metabolome; omics; proteome; transcriptome.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers / analysis
  • Ecosystem*
  • Machine Learning
  • Proteomics* / methods

Substances

  • Biomarkers