PheSeq, a Bayesian deep learning model to enhance and interpret the gene-disease association studies

Genome Med. 2024 Apr 16;16(1):56. doi: 10.1186/s13073-024-01330-7.

Abstract

Despite the abundance of genotype-phenotype association studies, the resulting associations often lack robustness and interpretability. To address these challenges, we introduce PheSeq, a Bayesian deep learning model that enhances and interprets association studies by integrating and perceiving phenotype descriptions. Applying PheSeq in three case studies on Alzheimer's disease, breast cancer, and lung cancer, we identify 1024 priority genes for Alzheimer's disease, 818 for breast cancer, and 566 for lung cancer. Benefiting from data fusion, these findings show moderate positive rates and high recall rates, and they add interpretability to gene-disease association studies.
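
As a rough illustration of how the positive and recall rates mentioned above could be computed for a prioritized gene list, the sketch below compares a predicted gene set with a reference set. It assumes that "positive rate" means the fraction of prioritized genes confirmed by the reference (i.e., precision), which the abstract does not state. The function name `precision_recall` and the gene labels are hypothetical placeholders, not data or code from the study.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted gene set against a reference set.

    Assumption: "positive rate" is read here as precision, the share of
    predicted genes that also appear in the reference set.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy placeholder labels (hypothetical, not genes reported by the study).
predicted_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
reference_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_F"}

precision, recall = precision_recall(predicted_genes, reference_genes)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case three of the five predicted genes are in the reference set, and three of the four reference genes are recovered. A large prioritized list tends to show this pattern: recall rises while precision stays moderate.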

Keywords: p-value; Associated genes; Data fusion; Embedding data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alzheimer Disease* / genetics
  • Bayes Theorem
  • Breast Neoplasms* / genetics
  • Deep Learning*
  • Female
  • Genetic Association Studies
  • Humans
  • Lung Neoplasms*