A new approach to identifying patients with elevated risk for Fabry disease using a machine learning algorithm

Orphanet J Rare Dis. 2021 Dec 20;16(1):518. doi: 10.1186/s13023-021-02150-3.


Background: Fabry disease (FD) is a rare genetic disorder characterized by glycosphingolipid accumulation and progressive damage across multiple organ systems. Because its presentation is heterogeneous, the condition is likely significantly underdiagnosed. Several approaches, including provider education efforts and newborn screening, have been used to address underdiagnosis of FD across the age spectrum, with limited success. Artificial intelligence (AI) methods present another option for improving diagnosis. These methods isolate common health history patterns among patients using longitudinal real-world data, and can be particularly useful when patients experience nonspecific, heterogeneous symptoms over time. In this study, the performance of an AI tool in identifying patients with FD was analyzed. The tool was calibrated using de-identified health record data from a large cohort of nearly 5000 FD patients, from which it extracted phenotypic patterns. The tool then used this FD pattern information to estimate, for each individual in a testing dataset, the likelihood of FD. Patterns were reviewed and confirmed with medical experts.

Results: The AI tool demonstrated strong analytic performance in identifying FD patients. In out-of-sample testing, it achieved an area under the receiver operating characteristic curve (AUROC) of 0.82. Performance was similar in male-only and female-only cohorts, with AUROCs of 0.83 and 0.82, respectively. The tool also isolated small segments of the population with greatly increased prevalence of FD: among the 1% of the population that the tool ranked at highest risk, FD was 23.9 times more prevalent than in the population overall. The AI algorithm drew on hundreds of phenotypic signals to make predictions, including both familiar symptoms associated with FD (e.g., renal manifestations) and less well-studied characteristics.
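The two metrics reported above, AUROC and the relative prevalence of FD in the highest-risk 1%, can be illustrated with a minimal sketch. This is not the study's implementation: the function names `auroc` and `lift_in_top_fraction` are hypothetical, and the scores and labels are toy values invented for illustration, not study data.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift_in_top_fraction(
    scores: Sequence[float], labels: Sequence[int], fraction: float
) -> float:
    """Prevalence among the highest-scored `fraction` of individuals,
    divided by prevalence in the whole population."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    overall_prevalence = sum(labels) / len(labels)
    if overall_prevalence == 0:
        raise ValueError("no positive cases in the population")
    n_top = max(1, int(len(scores) * fraction))
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_prevalence = sum(y for _, y in ranked[:n_top]) / n_top
    return top_prevalence / overall_prevalence


# Toy example (hypothetical values): 10 individuals, 2 with the condition.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

toy_auroc = auroc(toy_scores, toy_labels)
toy_lift = lift_in_top_fraction(toy_scores, toy_labels, fraction=0.1)
print(f"AUROC: {toy_auroc:.4f}")
print(f"Lift in top 10%: {toy_lift:.1f}x")
```

Under this definition, the reported figure of 23.9 corresponds to `fraction=0.01`: prevalence of FD among the top-ranked 1% divided by prevalence in the full population. The pairwise AUROC loop is quadratic in cohort size and is written for readability; a rank-based computation would be preferable for large cohorts.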

Conclusions: The AI tool analyzed in this study performed well in identifying Fabry disease patients from structured medical history data, with an out-of-sample AUROC of 0.82. Performance was maintained in all-male and all-female cohorts, and the phenotypic manifestations of FD highlighted by the tool were reviewed and confirmed by clinical experts in the condition. The platform's analytic performance, transparency, and ability to generate predictions from existing real-world health data may allow it to help reduce the persistent underdiagnosis of Fabry disease.

Keywords: AI; Fabry disease; Patient identification; Phenotypic biomarker.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Fabry Disease* / genetics
  • Female
  • Humans
  • Infant, Newborn
  • Kidney
  • Machine Learning
  • Male