Type-2 Fuzzy PCA Approach in Extracting Salient Features for Molecular Cancer Diagnostics and Prognostics

IEEE Trans Nanobioscience. 2019 Jul;18(3):482-489. doi: 10.1109/TNB.2019.2917814. Epub 2019 May 20.


Machine learning is becoming a powerful tool for cancer diagnosis and prognosis based on classification of high-dimensional molecular data. However, extracting classification features from high-dimensional datasets remains challenging. Principal component analysis (PCA) is widely used for dimensionality reduction, but PCA and most PCA-based feature extraction methods are known to be sensitive to noise, which may reduce the accuracy of subsequent classification. To address this problem, we propose a robust fuzzy PCA with interval type-2 (IT-2) fuzzy membership functions for feature extraction. We tested the performance of three widely used classifiers using features extracted by the proposed approaches and by other feature extraction methods: PCA-based methods (i.e., conventional PCA and fuzzy PCA), linear discriminant analysis (LDA), and support vector machine recursive feature elimination (SVM-RFE). The proposed feature extraction approaches showed better performance on cancer transcriptome and proteome datasets.
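
The general idea of a membership-weighted PCA can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the membership function, so the distance-based form below, the two fuzzifier values `m_low` and `m_high` that bound the membership interval, the median-based scale, and the midpoint type reduction are all hypothetical choices. The function names are likewise invented for this sketch.

```python
import numpy as np


def membership(dist2, scale, m):
    """Type-1 fuzzy membership that decays with squared distance.

    `m` > 1 is the fuzzifier (an assumed form, not taken from the paper).
    """
    return 1.0 / (1.0 + (dist2 / scale) ** (1.0 / (m - 1.0)))


def it2_fuzzy_pca(X, n_components, m_low=1.5, m_high=3.0, n_iter=10):
    """Membership-weighted PCA with an interval of memberships per sample.

    Returns the weighted mean, the principal axes (one per row), and the
    type-reduced membership of each sample.
    """
    X = np.asarray(X, dtype=float)
    u = np.ones(X.shape[0])  # start with every sample fully trusted

    for _ in range(n_iter):
        # Weighted mean and squared distance of each sample from it.
        mean = (u[:, None] * X).sum(axis=0) / u.sum()
        dist2 = ((X - mean) ** 2).sum(axis=1)
        scale = np.median(dist2) + 1e-12  # assumed robust scale

        # Two fuzzifiers give two type-1 memberships; their pointwise
        # min and max form the lower and upper bounds of the interval.
        u_a = membership(dist2, scale, m_low)
        u_b = membership(dist2, scale, m_high)
        lower = np.minimum(u_a, u_b)
        upper = np.maximum(u_a, u_b)

        # Assumed type reduction: take the midpoint of the interval.
        u = 0.5 * (lower + upper)

    # Final weighted mean and weighted covariance matrix.
    mean = (u[:, None] * X).sum(axis=0) / u.sum()
    Xc = X - mean
    cov = (u[:, None] * Xc).T @ Xc / u.sum()

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order].T
    return mean, components, u


def project(X, mean, components):
    """Extract features by projecting centered samples onto the axes."""
    return (np.asarray(X, dtype=float) - mean) @ components.T
```

In this sketch, samples far from the weighted mean receive low membership and therefore contribute little to the covariance estimate, which is the mechanism by which a fuzzy PCA is intended to reduce sensitivity to noisy samples. The projected features returned by `project` would then be passed to a classifier of choice.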

MeSH terms

  • Computational Biology
  • Databases, Genetic
  • Fuzzy Logic*
  • Gene Expression Profiling / methods*
  • Humans
  • Machine Learning
  • Neoplasms* / diagnosis
  • Neoplasms* / genetics
  • Neoplasms* / metabolism
  • Principal Component Analysis*
  • Prognosis
  • Transcriptome