Alzheimer's disease knowledge graph enhances knowledge discovery and disease prediction

Comput Biol Med. 2025 Jun;192(Pt A):110285. doi: 10.1016/j.compbiomed.2025.110285. Epub 2025 Apr 29.

Abstract

Objective: To construct an Alzheimer's Disease Knowledge Graph (ADKG) by extracting and integrating relationships among Alzheimer's disease (AD), genes, variants, chemicals, drugs, and other diseases from biomedical literature, aiming to identify existing treatments, potential targets, and diagnostic methods for AD.

Methods: We annotated 800 PubMed abstracts (ADERC corpus) with 20,886 entities and 4,935 relationships, augmented via GPT-4. A SciBERT-based SpERT model trained on this data extracted relations from PubMed abstracts, supported by biomedical databases and by entity linking refined through abbreviation resolution and string matching. The resulting knowledge graph was used to train embedding models that predict novel relationships. ADKG's utility was validated by integrating it with UK Biobank data for predictive modeling.
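
As an illustration of how a graph embedding model can score candidate relationships, the sketch below uses a TransE-style distance. The abstract does not say which embedding models were trained, so TransE is an assumption here, and the entity names, relation name, and vectors are hypothetical toy values rather than anything learned from ADKG.

```python
import math

# Hypothetical toy embeddings (assumption): real vectors would be learned
# from the knowledge graph's triplets, not written by hand.
ENTITY_VECS = {
    "drug_A": [0.9, 0.1],
    "gene_X": [1.0, 1.0],
    "gene_Y": [-1.0, 0.5],
}
RELATION_VECS = {
    "targets": [0.1, 0.9],
}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between head + relation and tail.

    Scores are at most zero; values closer to zero mean a more plausible triplet.
    """
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    t = ENTITY_VECS[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail, most plausible first."""
    candidates = [e for e in ENTITY_VECS if e != head]
    return sorted(
        candidates,
        key=lambda e: transe_score(head, relation, e),
        reverse=True,
    )


if __name__ == "__main__":
    for tail in rank_tails("drug_A", "targets"):
        print(tail, round(transe_score("drug_A", "targets", tail), 3))
```

In a link prediction setting, the highest-ranked candidates that are not already in the graph would be the ones put forward as novel relationships to check against external evidence.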

Results: The ADKG contained 3,199,276 entity mentions and 633,733 triplets, linking more than 5,000 unique entities and capturing complex AD-related interactions. Its graph embedding models produced evidence-supported predictions, enabling testable hypotheses. In UK Biobank predictive modeling, ADKG-enhanced models achieved a higher AUROC (0.928) than models without ADKG enhancement (0.903).
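
For readers less familiar with the metric, AUROC is the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are made-up toy values (an assumption for illustration), not UK Biobank data, and the function name `auroc` is hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = case, 0 = control.
labels = [1, 1, 0, 0, 1, 0]
baseline_scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.6]   # hypothetical model without graph features
enhanced_scores = [0.9, 0.65, 0.5, 0.2, 0.7, 0.6]  # hypothetical model with graph features

if __name__ == "__main__":
    print("baseline AUROC:", round(auroc(labels, baseline_scores), 3))
    print("enhanced AUROC:", round(auroc(labels, enhanced_scores), 3))
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration; at cohort scale a rank-based implementation such as scikit-learn's `roc_auc_score` would be the usual choice.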

Conclusion: By synthesizing literature-derived insights into a computable framework, ADKG bridges molecular mechanisms to clinical phenotypes, advancing precision medicine in Alzheimer's research. Its structured data and predictive utility underscore its potential to accelerate therapeutic discovery and risk stratification.

Keywords: Alzheimer's disease; Disease prediction; Knowledge graph construction; Link prediction.

MeSH terms

  • Alzheimer Disease* / diagnosis
  • Alzheimer Disease* / genetics
  • Alzheimer Disease* / metabolism
  • Computational Biology* / methods
  • Databases, Factual
  • Humans
  • Knowledge Discovery*