Deep GONet: self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data

Victoria Bourgeais; Farida Zehraoui; Mohamed Ben Hamdoune; Blaise Hanczar

doi:10.1186/s12859-021-04370-7

BMC Bioinformatics. 2021 Sep 22;22(Suppl 10):455. doi: 10.1186/s12859-021-04370-7.

Authors

Victoria Bourgeais¹, Farida Zehraoui², Mohamed Ben Hamdoune², Blaise Hanczar²

Affiliations

¹ IBISC, Univ Evry, Université Paris-Saclay, 91020, Évry-Courcouronnes, France. victoria.bourgeais@univ-evry.fr.
² IBISC, Univ Evry, Université Paris-Saclay, 91020, Évry-Courcouronnes, France.

Abstract

Background: Rapid advances in genomic sequencing techniques are making the large-scale production of gene expression data possible, which is driving the development of precision medicine. Deep learning is a promising approach for phenotype prediction (clinical diagnosis, prognosis, and drug response) based on gene expression profiles. Existing deep learning models are usually regarded as black boxes that provide accurate predictions but are not interpretable. However, both accuracy and interpretability are essential for precision medicine. In addition, most models do not integrate domain knowledge. Hence, the main focus of this paper is making deep learning models interpretable for medical applications by using prior biological knowledge.

Results: We propose a new self-explainable deep learning model, called Deep GONet, that integrates the Gene Ontology into the hierarchical architecture of the neural network. The model is based on a fully-connected architecture constrained by the Gene Ontology annotations, such that each neuron represents a biological function. Experiments on cancer diagnosis datasets show that Deep GONet is both easily interpretable and performs well at discriminating cancer from non-cancer samples.
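
As an illustration of what "constrained by the Gene Ontology annotations" could look like in practice, the sketch below builds a binary gene-to-term mask and applies it to the weights of a dense layer, so that each output neuron only receives input from the genes annotated to its term. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say whether the constraint is a hard mask or a softer penalty, and the gene names, the toy annotation table, and the functions build_mask and masked_dense are all hypothetical.

```python
import numpy as np

# Toy inputs (hypothetical): four genes and two GO terms with made-up annotations.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
go_terms = {
    "GO:0006915": ["gene_a", "gene_b"],  # apoptotic process
    "GO:0008283": ["gene_c", "gene_d"],  # cell population proliferation
}


def build_mask(genes, go_terms):
    """Return a (n_genes, n_terms) 0/1 matrix; 1 means the gene is annotated to the term."""
    gene_index = {g: i for i, g in enumerate(genes)}
    mask = np.zeros((len(genes), len(go_terms)))
    for j, members in enumerate(go_terms.values()):
        for g in members:
            mask[gene_index[g], j] = 1.0
    return mask


def masked_dense(x, weights, bias, mask):
    """Dense layer with ReLU whose weights are zeroed wherever the mask is 0.

    Assumption: a hard mask is used here purely for illustration.
    """
    return np.maximum(0.0, x @ (weights * mask) + bias)


mask = build_mask(genes, go_terms)
weights = np.ones((len(genes), len(go_terms)))
bias = np.zeros(len(go_terms))
expression = np.array([1.0, 2.0, 3.0, 4.0])  # one toy expression profile
term_activations = masked_dense(expression, weights, bias, mask)
```

With all weights set to one, each neuron simply sums the expression of its annotated genes, so changing gene_c affects only the second neuron. Deeper layers could be masked the same way using parent-child relations between terms, which is again an assumption about how the hierarchy would be encoded.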

Conclusions: Our model explains its predictions by identifying the most important neurons and associating them with biological functions, making the model understandable for biologists and physicians.
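
The idea of identifying the most important neurons and mapping them to biological functions can be sketched as follows. The abstract does not state which relevance measure is used, so the score below (activation multiplied by the outgoing weight to the predicted class) is an assumption chosen for simplicity, and rank_neurons and the term labels are hypothetical names.

```python
import numpy as np


def rank_neurons(activations, output_weights, term_names, top_k=3):
    """Rank hidden neurons by |activation * outgoing weight| for one sample.

    Assumption: this simple product stands in for whatever relevance
    measure the model actually uses.
    """
    scores = np.asarray(activations) * np.asarray(output_weights)
    order = np.argsort(-np.abs(scores))[:top_k]
    return [(term_names[i], float(scores[i])) for i in order]


# Toy values (hypothetical): three neurons, each labelled with a biological function.
term_names = ["term_a", "term_b", "term_c"]
activations = [0.5, 2.0, 0.0]
output_weights = [1.0, -1.5, 4.0]
top = rank_neurons(activations, output_weights, term_names, top_k=2)
```

Because each neuron carries a function label, the ranked list reads directly as a list of biological functions that pushed the prediction towards or away from the class, according to the sign of the score.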

Keywords: Deep learning; Gene Ontology; Gene expression; Model interpretation; Phenotype prediction.

MeSH terms

Gene Expression
Gene Ontology
Humans
Neoplasms*
Neural Networks, Computer*
Phenotype