Radiomics analysis using stability selection supervised component analysis for right-censored survival data

Comput Biol Med. 2020 Sep:124:103959. doi: 10.1016/j.compbiomed.2020.103959. Epub 2020 Aug 6.


Radiomics is an emerging field that extracts large numbers of quantitative features from biomedical images using data-characterization algorithms. Distinctive imaging features identified in this way can be used to predict prognosis and therapeutic response, and they can provide a noninvasive approach to personalized therapy. So far, many published radiomics studies have identified prognostic markers from biomedical images with existing out-of-the-box algorithms that are not specific to radiomics data. To make better use of biomedical images, we propose a novel machine learning approach, stability selection supervised principal component analysis (SSSuperPCA), which identifies stable features from radiomics big data and couples this selection with dimension reduction for right-censored survival outcomes. The proposed approach identifies a set of stable features that are highly associated with the survival outcomes in a simple yet meaningful manner, while controlling the per-family error rate. We evaluate the performance of SSSuperPCA using simulations and real data sets for non-small cell lung cancer and head and neck cancer, and compare it with other machine learning algorithms. The results demonstrate that our method has a competitive edge over existing methods in identifying prognostic markers from biomedical imaging data for the prediction of right-censored survival outcomes.
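The abstract does not spell out the algorithm, so the following is only a minimal sketch of how stability selection could be combined with supervised principal component analysis for right-censored data. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (cox_scores, stability_select, first_component), the use of univariate Cox score statistics for ranking, the choice of q top-ranked features per subsample, the threshold pi_thr, and the simulated data. Tied event times are ignored. The per-family error rate bound used is the standard stability selection bound q^2 / ((2 * pi_thr - 1) * p), which may differ from the control applied in the paper.

```python
import numpy as np

def cox_scores(X, time, event):
    """Univariate Cox score statistics at beta = 0, one per column of X.
    Assumption: no tied event times (ties are not handled specially)."""
    order = np.argsort(-time)               # descending time: risk set is a prefix
    Xs = X[order]
    ev = event[order].astype(bool)
    n_at_risk = np.arange(1, len(time) + 1)[:, None]
    mean = np.cumsum(Xs, axis=0) / n_at_risk              # risk-set mean
    var = np.cumsum(Xs ** 2, axis=0) / n_at_risk - mean ** 2  # risk-set variance
    U = (Xs[ev] - mean[ev]).sum(axis=0)     # score, summed over observed events
    V = var[ev].sum(axis=0)                 # information, summed over events
    return U / np.sqrt(np.maximum(V, 1e-12))

def stability_select(X, time, event, q=10, n_subsamples=100, pi_thr=0.7, seed=0):
    """Keep features ranked in the top q on at least pi_thr of half-subsamples."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        z = np.abs(cox_scores(X[idx], time[idx], event[idx]))
        counts[np.argsort(z)[-q:]] += 1     # top-q features on this subsample
    freq = counts / n_subsamples
    pfer_bound = q ** 2 / ((2 * pi_thr - 1) * p)  # upper bound on expected false selections
    return np.flatnonzero(freq >= pi_thr), freq, pfer_bound

def first_component(X_sel):
    """First principal component scores of the selected (centered) features."""
    Xc = X_sel - X_sel.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

# Hypothetical simulated data: only the first 5 of 500 features affect the hazard.
rng = np.random.default_rng(1)
n, p = 300, 500
X = rng.standard_normal((n, p))
hazard = np.exp(X[:, :5].sum(axis=1))
t_event = rng.exponential(1.0 / hazard)
t_cens = rng.exponential(5.0, size=n)       # independent censoring times
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens                   # True when the event was observed

selected, freq, pfer_bound = stability_select(X, time, event)
print("selected features:", selected)
print("PFER bound:", pfer_bound)
if selected.size:
    pc1 = first_component(X[:, selected])   # candidate covariate for a survival model
    print("component length:", pc1.shape[0])
```

In this sketch the resulting component would then be entered as a single covariate in a survival model such as Cox regression. Raising pi_thr or lowering q tightens the error bound at the cost of selecting fewer features.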

Keywords: Bioinformatics; Data mining; Dimensionality reduction; Machine learning; Radiomics.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Carcinoma, Non-Small-Cell Lung* / diagnostic imaging
  • Humans
  • Lung Neoplasms* / diagnostic imaging
  • Machine Learning
  • Principal Component Analysis*