Classification and gene selection of triple-negative breast cancer subtype embedding gene connectivity matrix in deep neural network

Brief Bioinform. 2021 Sep 2;22(5):bbaa395. doi: 10.1093/bib/bbaa395.

Abstract

Triple-negative breast cancer (TNBC) is a challenging breast cancer subtype for oncological therapy. It is commonly divided into several molecular subtypes, and accurate, stable classification of the six subtypes is essential for personalized treatment of TNBC. In this study, we propose a new framework to distinguish the six subtypes of TNBC; it is one of only a handful of studies to perform this classification on mRNA and long noncoding RNA expression data. In particular, we developed a gene selection approach named DGGA, which takes correlation information between genes into account when measuring gene importance and then effectively removes redundant genes. To improve gene selection performance, we devised a gene scoring approach that combines GeneRank scores with gene importance generated by a deep neural network (DNN), taking both inter-subtype discrimination and correlations among genes into account. More importantly, we embedded a gene connectivity matrix in the DNN for sparse learning, so that weight changes during training are additionally considered when measuring the relative importance of each gene. Finally, a genetic algorithm was used to simulate the natural evolutionary process and search for the optimal gene subset for TNBC subtype classification. We validated the proposed method through cross-validation, and the results demonstrate that it obtains more accurate classification results with fewer genes. The implementation of the proposed method is available at https://github.com/RanSuLab/TNBC.
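As an illustration of the GeneRank component mentioned above, the following is a minimal sketch of the standard GeneRank recurrence, r = (1 - d) ex + d W D^{-1} r, where W is a gene connectivity (adjacency) matrix and D is the diagonal matrix of gene degrees. The abstract does not specify how DGGA computes GeneRank or how it combines the result with DNN-derived importance, so the function name `generank`, the damping value `d=0.85`, the four-gene toy network and the uniform expression scores are all assumptions made for illustration, not the authors' implementation.

```python
def generank(adjacency, expression_scores, d=0.85, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for r = (1 - d) * ex + d * W * D^{-1} * r.

    adjacency         -- symmetric 0/1 matrix W as a list of lists (hypothetical input)
    expression_scores -- non-negative per-gene scores ex (hypothetical input)
    d                 -- damping factor in [0, 1); 0 ignores the network entirely
    """
    n = len(expression_scores)
    degrees = [sum(row) for row in adjacency]  # diagonal of D
    ranks = list(expression_scores)
    for _ in range(max_iter):
        new_ranks = []
        for i in range(n):
            # Each neighbour j passes on its rank divided by its own degree.
            neighbour_sum = sum(
                adjacency[i][j] * ranks[j] / degrees[j]
                for j in range(n)
                if adjacency[i][j] and degrees[j] > 0
            )
            new_ranks.append((1 - d) * expression_scores[i] + d * neighbour_sum)
        # Stop once no gene's rank moved by more than the tolerance.
        if max(abs(a - b) for a, b in zip(new_ranks, ranks)) < tol:
            return new_ranks
        ranks = new_ranks
    return ranks


# Toy network (made up for this sketch): gene 2 is a hub linked to genes 0, 1 and 3.
toy_adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_expression = [1.0, 1.0, 1.0, 1.0]  # uniform scores isolate the network effect

ranks = generank(toy_adjacency, toy_expression)
for gene, score in enumerate(ranks):
    print(f"gene {gene}: {score:.4f}")
```

With uniform expression scores, any difference in rank comes from connectivity alone, so the hub gene receives the highest score. In a combined scoring scheme of the kind the abstract describes, such a rank would be merged with a second, model-derived importance value; the merging rule is not given in the abstract and is left out here.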

Keywords: GeneRank; TNBC subtype; deep neural network; gene selection; weight variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Antineoplastic Agents / therapeutic use
  • Female
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks
  • Humans
  • Neoplasm Proteins / genetics*
  • Neoplasm Proteins / metabolism
  • Neural Networks, Computer*
  • Precision Medicine
  • RNA, Long Noncoding / genetics*
  • RNA, Long Noncoding / metabolism
  • RNA, Messenger / genetics*
  • RNA, Messenger / metabolism
  • Triple Negative Breast Neoplasms / classification*
  • Triple Negative Breast Neoplasms / drug therapy
  • Triple Negative Breast Neoplasms / genetics*
  • Triple Negative Breast Neoplasms / pathology

Substances

  • Antineoplastic Agents
  • Neoplasm Proteins
  • RNA, Long Noncoding
  • RNA, Messenger