Purpose: Roughly 70% of patients with suspected Mendelian disease remain undiagnosed after genome sequencing, partly because knowledge about pathogenic genes is incomplete and constantly growing. Generating a novel pathogenic gene hypothesis from patient data can be time-consuming, especially where cohort-based analysis is not available.
Methods: Each patient genome contains dozens to hundreds of candidate variants. Many sources of indirect evidence about each candidate may be considered. We introduce InpherNet, a network-based machine learning approach leveraging Monarch Initiative data to accelerate this process.
Results: InpherNet ranks candidate genes based on evidence from their gene neighbors: orthologs, paralogs, functional pathway members, and colocalized interaction partners. It can propose novel pathogenic genes and reveal known pathogenic genes whose diagnosed patient-based annotation is missing or partial. We applied InpherNet to patient cases where the causative gene is incorrectly ranked low by clinical gene-ranking methods that use only patient-derived evidence. InpherNet correctly ranks the causative gene first, or within the top five, in roughly twice as many cases as seven comparable tools, including in cases where our knowledgebase contains no clinical evidence for the diagnostic gene.
Conclusion: InpherNet improves the state of the art in considering candidate gene neighbors to accelerate monogenic diagnosis.
© 2021. The Author(s), under exclusive licence to the American College of Medical Genetics and Genomics.