Predictive Analytics for Glaucoma Using Data From the All of Us Research Program

Am J Ophthalmol. 2021 Jul:227:74-86. doi: 10.1016/j.ajo.2021.01.008. Epub 2021 Jan 23.

Abstract

Purpose: To (1) use All of Us (AoU) data to validate a previously published single-center model predicting the need for surgery among individuals with glaucoma, (2) train new models using AoU data, and (3) share insights regarding this novel data source for ophthalmic research.

Design: Development and evaluation of machine learning models.

Methods: Electronic health record data were extracted from AoU for 1,231 adults diagnosed with primary open-angle glaucoma. The single-center model was applied to AoU data for external validation. AoU data were then used to train new models for predicting the need for glaucoma surgery using multivariable logistic regression, artificial neural networks, and random forests. Five-fold cross-validation was performed. Model performance was evaluated based on area under the receiver operating characteristic curve (AUC), accuracy, precision, and recall.
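The evaluation scheme described in the Methods can be sketched in a few lines. The block below is a minimal, standard-library illustration of five-fold splitting and of the four reported metrics (AUC, accuracy, precision, recall); it is not the authors' code. The function names, the fixed shuffle seed, and the pairwise formulation of AUC are assumptions made for illustration, and model fitting itself is omitted.

```python
import random


def accuracy_precision_recall(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation.
    The seed is an arbitrary choice so that the split is reproducible."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


if __name__ == "__main__":
    # Toy labels and scores, not study data.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y, s))  # 0.75
    for train, test in five_fold_indices(1231):
        print(len(train), len(test))
```

In a full pipeline, each model would be fit on the training indices of a fold and scored on the held-out indices, with the metrics averaged across the five folds.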

Results: The mean (standard deviation) age of the AoU cohort was 69.1 (10.5) years; 57.3% of participants were women and 33.5% were Black, both significantly higher proportions than in the single-center cohort (P = .04 and P < .001, respectively). Of 1,231 participants, 286 (23.2%) needed glaucoma surgery. When the single-center model was applied to AoU data, accuracy was 0.69 and AUC was only 0.49. Using AoU data to train new models resulted in superior performance: AUCs ranged from 0.80 (logistic regression) to 0.99 (random forests).
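To put the external-validation figures in context, the following sketch works only from the two counts reported above (1,231 participants, 286 needing surgery). It computes the outcome prevalence and the accuracy and AUC that a trivial rule predicting "no surgery" for everyone would obtain. This is illustrative arithmetic, not an analysis from the study; the variable and function names are assumptions.

```python
# Counts taken from the Results; everything derived below is plain arithmetic.
n_total = 1231
n_surgery = 286

prevalence = n_surgery / n_total
# Accuracy of a rule that always predicts "no surgery".
majority_baseline_accuracy = (n_total - n_surgery) / n_total


def auc_pairwise(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A constant score cannot rank any case above another, so every pair is a tie.
y_true = [1] * n_surgery + [0] * (n_total - n_surgery)
constant_auc = auc_pairwise(y_true, [0.0] * n_total)

print(f"prevalence: {prevalence:.3f}")                                  # 0.232
print(f"majority-class accuracy: {majority_baseline_accuracy:.3f}")    # 0.768
print(f"constant-score AUC: {constant_auc:.3f}")                        # 0.500
```

Read this way, an accuracy of 0.69 with an AUC of 0.49 is in the range of an uninformative predictor, which is consistent with the abstract describing the AUC as "only 0.49".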

Conclusions: Models trained with national AoU data outperformed the model trained with single-center data. Although AoU does not currently include ophthalmic imaging, it offers several strengths over similar big-data sources such as claims data. AoU is a promising new data source for ophthalmic research.

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Aged
  • Aged, 80 and over
  • Databases, Factual / statistics & numerical data*
  • Electronic Health Records / statistics & numerical data*
  • Female
  • Filtering Surgery / methods*
  • Glaucoma, Open-Angle / diagnosis*
  • Glaucoma, Open-Angle / surgery*
  • Humans
  • Information Storage and Retrieval / methods
  • Logistic Models
  • Machine Learning
  • Male
  • Middle Aged
  • Models, Statistical
  • Neural Networks, Computer
  • ROC Curve