Deep GONet: self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data

BMC Bioinformatics. 2021 Sep 22;22(Suppl 10):455. doi: 10.1186/s12859-021-04370-7.

Abstract

Background: Rapid advances in genomic sequencing have made large-scale production of gene expression data possible, which in turn drives the development of precision medicine. Deep learning is a promising approach for phenotype prediction (clinical diagnosis, prognosis, and drug response) from gene expression profiles. Existing deep learning models are usually regarded as black boxes: they provide accurate predictions but are not interpretable. Yet both accuracy and interpretability are essential for precision medicine. In addition, most models do not integrate domain knowledge. This paper therefore focuses on making deep learning models interpretable for medical applications by using prior biological knowledge.

Results: We propose a new self-explainable deep learning model, Deep GONet, that integrates the Gene Ontology into the hierarchical architecture of the neural network. The model is based on a fully-connected architecture constrained by the Gene Ontology annotations, such that each neuron represents a biological function. Experiments on cancer diagnosis datasets demonstrate that Deep GONet is both easily interpretable and highly effective at discriminating cancer from non-cancer samples.
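
To make the idea of an ontology-constrained layer concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not say how the constraint is enforced, so the sketch assumes a hard binary mask that zeroes every gene-to-term connection absent from the annotations. The gene labels, term labels, annotation sets, and the helper names `build_mask` and `masked_dense` are all hypothetical.

```python
import numpy as np

# Hypothetical toy inputs: placeholder gene and ontology-term labels,
# with an invented annotation map (term -> set of annotated genes).
genes = ["geneA", "geneB", "geneC", "geneD"]
terms = ["GO_term_1", "GO_term_2", "GO_term_3"]
annotations = {
    "GO_term_1": {"geneA", "geneC"},
    "GO_term_2": {"geneA", "geneB"},
    "GO_term_3": {"geneC", "geneD"},
}

def build_mask(inputs, outputs, links):
    """Return a (len(inputs), len(outputs)) 0/1 matrix of allowed connections."""
    mask = np.zeros((len(inputs), len(outputs)))
    for j, out in enumerate(outputs):
        for i, inp in enumerate(inputs):
            if inp in links[out]:
                mask[i, j] = 1.0
    return mask

def masked_dense(x, weights, bias, mask):
    """Dense layer with ReLU; connections outside the mask contribute nothing."""
    return np.maximum(0.0, x @ (weights * mask) + bias)

rng = np.random.default_rng(0)
mask = build_mask(genes, terms, annotations)
weights = rng.normal(size=mask.shape)
bias = np.zeros(len(terms))

x = rng.normal(size=(2, len(genes)))          # two samples, four genes
hidden = masked_dense(x, weights, bias, mask)  # one activation per term

# Perturbing geneB cannot change the GO_term_1 neuron, because geneB
# is not annotated to that term in the toy map above.
x_perturbed = x.copy()
x_perturbed[:, 1] += 10.0
hidden_perturbed = masked_dense(x_perturbed, weights, bias, mask)
```

Under this assumption each hidden neuron depends only on the genes annotated to its term, which is what allows a neuron to be read as a biological function. A soft constraint, such as a penalty on non-annotated weights, would be an alternative design.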

Conclusions: Our model explains its predictions by identifying the most important neurons and associating them with biological functions, which makes the model understandable to biologists and physicians.
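
The abstract does not state how neuron importance is computed. As a rough illustration only, the sketch below ranks the neurons of a final hidden layer by a simple stand-in score, activation times outgoing weight, and reports the term label attached to each. The function name `rank_neurons`, the labels, and all numbers are hypothetical.

```python
import numpy as np

def rank_neurons(activations, output_weights, labels, top_k=3):
    """Rank neurons by |activation * outgoing weight| for one sample.

    This score is an assumed stand-in for whatever relevance measure
    a real model would use.
    """
    contributions = activations * output_weights
    order = np.argsort(-np.abs(contributions), kind="stable")[:top_k]
    return [(labels[i], float(contributions[i])) for i in order]

# Hypothetical values for one sample: four neurons, each tied to a term label.
labels = ["GO_term_1", "GO_term_2", "GO_term_3", "GO_term_4"]
activations = np.array([0.5, 2.0, 0.0, 1.0])
output_weights = np.array([1.0, -1.5, 3.0, 2.0])

top = rank_neurons(activations, output_weights, labels, top_k=2)
print(top)
```

Because every neuron carries a term label, the ranked list can be read directly as a list of biological functions, with the sign of each contribution indicating whether it pushed the prediction up or down.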

Keywords: Deep learning; Gene Ontology; Gene expression; Model interpretation; Phenotype prediction.

MeSH terms

  • Gene Expression
  • Gene Ontology
  • Humans
  • Neoplasms*
  • Neural Networks, Computer*
  • Phenotype