Objectives: Direct electronic access to multiple electronic health record (EHR) systems through patient portals offers a novel avenue for decentralized research. Given the critical value of patient characterization, we sought to compare computable evaluation of health conditions from patient-portal EHR data against traditional self-report.
Materials and methods: In the nationwide Innovative Support for Patients with SARS-CoV-2 Infections Registry (INSPIRE) study, which linked self-reported questionnaires with multiplatform patient-portal EHR data, we compared self-reported health conditions across different clinical domains against computable definitions based on diagnosis codes, medications, vital signs, and laboratory testing. We assessed their concordance using Cohen's kappa and evaluated the prognostic significance of differentially captured features as predictors of 1-year all-cause hospitalization risk.
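As an illustration of the concordance measure named above, the following is a minimal sketch of Cohen's kappa for one binary condition recorded by two sources. The function name `cohens_kappa`, the variable names, and the toy data are assumptions made for this example; they are not taken from the study, and the study's actual analysis code is not described here.

```python
from typing import Sequence


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) ratings of the same participants."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of participants on whom both sources agree.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Marginal prevalence of the condition in each source.
    p_a = sum(a) / n
    p_b = sum(b) / n
    # Agreement expected by chance if the two sources were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both sources are constant and identical; kappa is formally
        # undefined, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 1 = condition present, 0 = absent.
self_report = [1, 0, 0, 1, 0, 0, 1, 0]
ehr_phenotype = [1, 1, 0, 1, 0, 0, 1, 1]

kappa = cohens_kappa(self_report, ehr_phenotype)
print(f"kappa = {kappa:.3f}")
```

In this toy example the EHR source flags two participants that self-report misses, so raw agreement is 6 of 8 while kappa is lower because it discounts agreement expected by chance.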
Results: Among 1683 participants (mean age 41 ± 15 years, 67% female, 63% non-Hispanic White), the prevalence of conditions varied substantially between EHR and self-report (-13.2% to +11.6% across definitions). Compared with comprehensive EHR phenotypes, self-report under-captured all conditions, including hypertension (27.9% vs 16.2%), diabetes (10.1% vs 6.2%), and heart disease (8.5% vs 4.3%). However, diagnosis codes alone were insufficient, often under-capturing conditions that participants self-reported. The risk of 1-year hospitalization was better discriminated by the same features drawn from patient-portal EHR (area under the receiver operating characteristic curve [AUROC] 0.79) than from self-report (AUROC 0.68).
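For readers less familiar with the AUROC values reported above, the sketch below computes AUROC as the probability that a randomly chosen hospitalized participant receives a higher predicted risk than a randomly chosen non-hospitalized one, with ties counted as half. The function name `auroc` and the toy labels and scores are assumptions for illustration only; the abstract does not specify how the risk model was fit or evaluated.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hospitalized within 1 year, 0 = not.
outcome = [0, 0, 1, 1]
predicted_risk = [0.10, 0.40, 0.35, 0.80]

auc = auroc(outcome, predicted_risk)
print(f"AUROC = {auc:.2f}")
```

The pairwise form is quadratic in sample size and is used here only for clarity; a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.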
Discussion: EHR-derived computable phenotypes identified a higher prevalence of comorbidities than self-report, and the additionally identified features carried prognostic value. However, definitions based solely on diagnosis codes often under-captured self-reported conditions, suggesting a role for broader EHR elements.
Conclusion: In this nationwide study, patient-portal-derived EHR data enabled extensive capture of patient characteristics across multiple EHR platforms, allowing better disease phenotyping than self-report.
Keywords: decentralized; multicenter; patient portal; pragmatic studies.
© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.