Structured reviews for data and knowledge-driven research

Database (Oxford). 2020 Jan 1:2020:baaa015. doi: 10.1093/database/baaa015.


Hypothesis generation is a critical step in research and a cornerstone in the rare disease field. Research is most efficient when those hypotheses are based on the entirety of current knowledge. Systematic review articles are commonly used in biomedicine to summarize existing knowledge and contextualize experimental data. However, the information contained within review articles is typically expressed only as free text, which is difficult to use computationally. Researchers struggle to navigate, collect and remix prior knowledge because it is scattered across several silos without seamless integration and access. This lack of a structured information framework hinders research by both experimental and computational scientists. To better organize knowledge and data, we built a structured review article focused specifically on NGLY1 Deficiency, an ultra-rare genetic disease first reported in 2012. We represented this structured review as a knowledge graph and stored the graph in a Neo4j database to simplify dissemination, querying and visualization of the network. Relative to free text, this structured review better promotes the principles of findability, accessibility, interoperability and reusability (FAIR). In collaboration with domain experts in NGLY1 Deficiency, we demonstrate how this resource can improve the efficiency and comprehensiveness of hypothesis generation. We also developed a read-write interface that allows domain experts to contribute FAIR structured knowledge to this community resource. In contrast to traditional free-text review articles, this structured review exists as a living knowledge graph that is curated by humans and accessible to computational analyses. Finally, we have generalized this workflow into modular and repurposable components that can be applied to other domain areas. This NGLY1 Deficiency-focused network is publicly available at

Availability and implementation: Database URL: Network data files are at: and source code at:
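
To illustrate why a knowledge graph is easier to query than free text, the following is a minimal sketch in plain Python. It is not the authors' implementation: the published resource is stored in Neo4j, whereas this sketch substitutes an in-memory adjacency list so that it runs without a database. All node names other than NGLY1 and NGLY1 Deficiency (ProteinX, PathwayY, CompoundZ, PhenotypeQ), all predicate names, and the helper functions build_graph and find_paths are hypothetical placeholders, not content from the actual network.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples. These are placeholders
# for illustration only and are NOT taken from the published network.
TRIPLES = [
    ("NGLY1", "associated_with", "NGLY1 Deficiency"),
    ("NGLY1", "interacts_with", "ProteinX"),
    ("ProteinX", "regulates", "PathwayY"),
    ("CompoundZ", "targets", "ProteinX"),
    ("PathwayY", "involved_in", "PhenotypeQ"),
]


def build_graph(triples):
    """Index triples as an adjacency list that can be walked in both directions."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
        # Store the reverse edge so paths can traverse a relation backwards.
        graph[obj].append((f"inverse_{pred}", subj))
    return graph


def find_paths(graph, start, goal, max_hops=3):
    """Return all simple paths from start to goal using at most max_hops edges.

    Each path alternates nodes and predicates: [node, pred, node, ...].
    """
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        hops_used = (len(path) - 1) // 2
        if hops_used >= max_hops:
            continue
        for pred, nxt in graph[node]:
            # Nodes sit at even indices; skip any node already on this path.
            if nxt not in path[::2]:
                stack.append((nxt, path + [pred, nxt]))
    return paths


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for path in find_paths(graph, "CompoundZ", "PhenotypeQ", max_hops=3):
        print(" -> ".join(path))
```

Each returned path is a chain of typed relations linking two concepts, which a domain expert could inspect as a candidate hypothesis. Answering the same question from free-text reviews would require reading and manually connecting statements across several articles.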


Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Biomedical Research / methods*
  • Biomedical Research / statistics & numerical data
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Congenital Disorders of Glycosylation / genetics
  • Congenital Disorders of Glycosylation / metabolism
  • Data Curation / methods
  • Data Mining / methods
  • Databases, Factual*
  • Humans
  • Internet
  • Knowledge Bases*
  • Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase / deficiency
  • Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase / genetics
  • Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase / metabolism
  • Systematic Reviews as Topic


Substances

  • Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase

Supplementary concepts

  • NGLY1 deficiency