Assess and validate predictive performance of models for in-hospital mortality in COVID-19 patients: A retrospective cohort study in the Netherlands comparing the value of registry data with high-granular electronic health records

Int J Med Inform. 2022 Nov:167:104863. doi: 10.1016/j.ijmedinf.2022.104863. Epub 2022 Sep 22.


Purpose: To assess, validate, and compare the predictive performance of models for in-hospital mortality of COVID-19 patients admitted to the intensive care unit (ICU) over two different waves of infections. The models were built on either high-granular electronic health record (EHR) data or less-granular registry data.

Methods: Observational study of all COVID-19 patients admitted to 19 Dutch ICUs participating in both the national quality registry National Intensive Care Evaluation (NICE) and the EHR-based Dutch Data Warehouse (hereafter EHR). Multiple models were developed on data from the first 24 h of ICU admission for patients admitted from February to June 2020 (first COVID-19 wave) and validated on prospective patients admitted to the same ICUs between July and December 2020 (second COVID-19 wave). We assessed model discrimination, calibration, and the degree of relatedness between the development and validation populations. Model coefficients were used to identify relevant risk factors.
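
The temporal validation design in Methods (develop on first-wave admissions, validate on second-wave admissions, then assess discrimination and calibration) can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names `admitted`, `risk`, and `died`, the helper functions, and all data values are hypothetical, and the predicted risks are hard-coded rather than produced by a fitted model. The 1 July 2020 cutoff is assumed from the wave dates given above, and calibration-in-the-large is only one of several possible calibration measures; the abstract does not say which were used.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split into development (admitted before cutoff) and validation (on/after cutoff)."""
    dev = [r for r in records if r["admitted"] < cutoff]
    val = [r for r in records if r["admitted"] >= cutoff]
    return dev, val


def auroc(y_true, y_score):
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both outcome classes")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration_in_the_large(y_true, y_prob):
    """Observed mortality minus mean predicted risk; 0 means no average over- or under-prediction."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)


# Synthetic records: "risk" stands in for a model's predicted mortality probability.
records = [
    {"admitted": date(2020, 3, 10), "risk": 0.8, "died": 1},
    {"admitted": date(2020, 4, 2), "risk": 0.2, "died": 0},
    {"admitted": date(2020, 8, 15), "risk": 0.9, "died": 1},
    {"admitted": date(2020, 9, 1), "risk": 0.3, "died": 0},
    {"admitted": date(2020, 10, 20), "risk": 0.6, "died": 1},
    {"admitted": date(2020, 11, 5), "risk": 0.7, "died": 0},
    {"admitted": date(2020, 12, 1), "risk": 0.1, "died": 0},
]

# Assumed cutoff between the first and second wave.
dev, val = temporal_split(records, cutoff=date(2020, 7, 1))

val_outcomes = [r["died"] for r in val]
val_risks = [r["risk"] for r in val]
val_auroc = auroc(val_outcomes, val_risks)
val_citl = calibration_in_the_large(val_outcomes, val_risks)
print(f"validation AUROC: {val_auroc:.3f}, calibration-in-the-large: {val_citl:+.3f}")
```

In a real analysis the model would be fitted on `dev` only, and the risks for `val` would come from that fitted model, so that the second-wave patients play no part in development.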

Results: A total of 1533 patients from the EHR and 1563 from the registry were included. With high-granular EHR data, the average AUROC was 0.69 (standard deviation of 0.05) for the internal validation, and the AUROC was 0.75 for the temporal validation. The registry model achieved an average AUROC of 0.76 (standard deviation of 0.05) in the internal validation and 0.77 in the temporal validation. In the EHR data, age and respiratory-system-related variables were the most important risk factors identified. In the NICE registry data, age and chronic respiratory insufficiency were the most important risk factors.

Conclusion: In our study, prognostic models built on less-granular but readily available registry data had similar performance to models built on high-granular EHR data and showed similar transportability to a prospective COVID-19 population. Future research is needed to confirm whether this finding holds for subsequent waves.

Keywords: Covid-19 [C01.748.610.763.500]; Critical care [E02.760.190]; Electronic Health Record [E05.318.308.940.968.625.500]; In-hospital mortality [E05.318.308.985.550.400]; Machine learning [G17.035.250.500]; Prognosis [E01.789].

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / epidemiology
  • Electronic Health Records
  • Hospital Mortality
  • Humans
  • Intensive Care Units
  • Netherlands / epidemiology
  • Registries
  • Retrospective Studies