Developing a synthetic control group using electronic health records: Application to a single-arm lifestyle intervention study

Prev Med Rep. 2021 Oct 4;24:101572. doi: 10.1016/j.pmedr.2021.101572. eCollection 2021 Dec.


Electronic health record (EHR) infrastructure offers a tremendous resource for identifying controls who match the characteristics of study participants in a single-arm trial. The objectives of this study were to (1) demonstrate the feasibility of curating a synthetic control group for an existing study cohort through EHR data extraction and (2) evaluate the effect of a lifestyle intervention on selected cardiovascular health metrics. A total of 711 university employees were recruited between 2008 and 2012 to participate in a health partner intervention to improve cardiovascular health and were followed for five years. Data on nearly 8000 eligible subjects were extracted from the EHR to create a synthetic control cohort covering the same study period. The selected cardiovascular health metrics were compared at 1 and 5 years of follow-up using crude comparison, exact matching, propensity score matching, and doubly robust estimation; the latter three methods were applied to minimize confounding. Blood pressure and body mass index improved in the intervention group compared to the EHR synthetic controls. The findings for changes in lipid measurements were somewhat unexpected. In the subgroup not taking lipid-lowering medications, the intervention group exhibited better control of cholesterol levels over time than the synthetic controls. Some measurements in the EHR system may be more robust for synthetic control selection than others. For these measurements, EHR synthetic controls can provide an alternative way to estimate intervention effects in single-arm studies.
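
As an illustration of why the adjusted methods named above can differ from a crude comparison, the following is a minimal sketch of a doubly robust (augmented inverse probability weighting) estimate on a toy data set. It is not the study's analysis code: the records, the single `age_band` covariate, the outcome values, and all function names are hypothetical, and the propensity and outcome models are simple within-stratum frequencies and means rather than the regression models a real analysis would typically use.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records (not study data):
# (age_band, in_intervention, change_in_systolic_bp)
records = (
    [("younger", 1, y) for y in (-5, -3)]
    + [("younger", 0, y) for y in (-2, 0, -1, -1, -2, 0)]
    + [("older", 1, y) for y in (0, 2, 1, 1, 2, 0)]
    + [("older", 0, y) for y in (5, 3)]
)


def crude_difference(rows):
    """Unadjusted difference in mean outcome, intervention minus control."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def fit_stratum_models(rows):
    """Fit saturated models within each covariate stratum.

    Returns, per stratum, the propensity score e = P(T=1 | x) and the
    outcome means m1 = E[Y | T=1, x] and m0 = E[Y | T=0, x].
    Assumes every stratum contains both intervention and control rows.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for x, t, y in rows:
        by_stratum[x][t].append(y)
    models = {}
    for x, arms in by_stratum.items():
        n = len(arms[0]) + len(arms[1])
        models[x] = {
            "e": len(arms[1]) / n,
            "m1": mean(arms[1]),
            "m0": mean(arms[0]),
        }
    return models


def aipw_estimate(rows):
    """Doubly robust (AIPW) estimate of the average intervention effect."""
    models = fit_stratum_models(rows)
    terms = []
    for x, t, y in rows:
        m = models[x]
        # Outcome-model contrast plus an inverse-probability-weighted residual.
        term = m["m1"] - m["m0"]
        if t == 1:
            term += (y - m["m1"]) / m["e"]
        else:
            term -= (y - m["m0"]) / (1 - m["e"])
        terms.append(term)
    return mean(terms)


if __name__ == "__main__":
    print("crude difference:", crude_difference(records))
    print("doubly robust estimate:", round(aipw_estimate(records), 6))
```

In this toy data the intervention group is concentrated in the older stratum, whose outcomes are worse regardless of intervention, so the crude difference (-0.5) understates the within-stratum effect of -3 that the doubly robust estimate recovers. The estimator is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified.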

Keywords: Controlled trials; Doubly robust; Electronic medical record; Pseudo control.