J Am Med Inform Assoc. 2014 Mar-Apr;21(2):272-9.
doi: 10.1136/amiajnl-2013-002151. Epub 2013 Sep 27.

Mining high-dimensional administrative claims data to predict early hospital readmissions

Danning He et al. J Am Med Inform Assoc. 2014 Mar-Apr.

Abstract

Background: Current readmission models use administrative data supplemented with clinical information. However, the majority of these result in poor predictive performance (area under the curve (AUC)<0.70).

Objective: To develop an administrative claim-based algorithm to predict 30-day readmission using standardized billing codes and basic admission characteristics available before discharge.

Materials and methods: The algorithm works by exploiting high-dimensional information in administrative claims data and automatically selecting empirical risk factors. We applied the algorithm to index admissions in two types of hospitalized patients: (1) medical patients and (2) patients with chronic pancreatitis (CP). We trained the models on 26,091 medical admissions and 3218 CP admissions from The Johns Hopkins Hospital (a tertiary research medical center) and tested them on 16,194 medical admissions and 706 CP admissions from Johns Hopkins Bayview Medical Center (a hospital that serves a more general patient population), and vice versa. Performance metrics included AUC, sensitivity, specificity, positive predictive value, negative predictive value, and F-measure.
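The performance metrics listed above can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the F-measure is assumed to be the balanced F1 score, and AUC is computed with the rank-based (Mann-Whitney) definition, where readmission is coded as 1.

```python
import math
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-dependent metrics from binary labels (1 = readmitted)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        # Assumption: "F-measure" is taken to mean F1 (beta = 1).
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC loop is quadratic in cohort size, which is fine for illustration; a sort-based rank computation would be the usual choice for cohorts of the size reported here.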

Results: From a pool of up to 5665 International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) diagnoses, 599 ICD-9-CM procedures, and 1815 Current Procedural Terminology codes observed, the algorithm learned a model consisting of 18 attributes from the medical patient cohort and five attributes from the CP cohort. Within-site and across-site validations had an AUC≥0.75 for the medical patient cohort and an AUC≥0.65 for the CP cohort.
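The across-site validation described above (train at one hospital, test at the other, then swap) can be sketched as a small harness. This is an illustrative outline under stated assumptions, not the authors' pipeline: `fit` and `score` are hypothetical callables supplied by the caller, and each cohort can be any object those callables understand.

```python
from itertools import permutations
from typing import Any, Callable, Dict, Tuple


def across_site_validation(
    cohorts: Dict[str, Any],
    fit: Callable[[Any], Any],
    score: Callable[[Any, Any], float],
) -> Dict[Tuple[str, str], float]:
    """Train on each site and evaluate on every other site.

    Returns a mapping from (train_site, test_site) to the score.
    """
    results: Dict[Tuple[str, str], float] = {}
    for train_site, test_site in permutations(cohorts, 2):
        model = fit(cohorts[train_site])  # learn only from the training site
        results[(train_site, test_site)] = score(model, cohorts[test_site])
    return results
```

With two sites keyed `"JHH"` and `"BMC"`, the result holds exactly two entries, one per direction. Keeping `fit` and `score` as parameters means the same harness works for any model and any metric, such as AUC.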

Conclusions: We have created an algorithm that is widely applicable to various patient cohorts and portable across institutions. The algorithm performed similarly to state-of-the-art readmission models that require clinical data.

Keywords: administrative claims data; algorithm; portability; predictive modelling; readmission; sensitivity and specificity.

Figures

Figure 1
Evaluation workflow. The figure shows the model being trained on the entire Johns Hopkins Hospital (JHH) cohort and tested on the entire Bayview Medical Center (BMC) cohort. Training on BMC and testing on JHH were performed in a similar fashion (not pictured).
Figure 2
Overview of the algorithm. The flow chart outlines the five steps in the algorithm, which can be classified as branch and bound. CPT, Current Procedural Terminology; ICD9, International Classification of Diseases, 9th Revision.
Figure 3
Receiver operating characteristic curves for across-site analyses on medical patient (ME) cohort and chronic pancreatitis (CP) cohort. (A) Training ME cohort on Johns Hopkins Hospital (JHH) and testing on Bayview Medical Center (BMC) (area under the curve (AUC)=0.81). (B) Training ME cohort on BMC and testing on JHH (AUC=0.78). (C) Training CP cohort on JHH and testing on BMC (AUC=0.65). (D) Training CP cohort on BMC and testing on JHH (AUC=0.73). Line color: threshold cutoff.
Figure 4
Correlation matrix for the 37 attributes in Johns Hopkins Hospital chronic pancreatitis (CP) cohort after univariate variable selection. The matrix represents the set of attributes highly correlated with outcome. The final model contains five attributes that independently contribute to predicting outcome. Darker color corresponds to higher correlation, and lighter color corresponds to lower correlation. Diagonal entries represent self-correlation, which is always 1.0.
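A correlation matrix like the one in Figure 4 can be computed as in the sketch below. The greedy filter that follows it is only one plausible way to reduce a correlated attribute set to independent contributors; the source does not specify the authors' procedure, so `drop_redundant` and its `max_abs_corr` threshold are assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; NaN if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return math.nan
    return cov / math.sqrt(vx * vy)


def correlation_matrix(columns: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise correlations; the diagonal (self-correlation) is fixed at 1.0."""
    names = list(columns)
    return {
        a: {b: 1.0 if a == b else pearson(columns[a], columns[b]) for b in names}
        for a in names
    }


def drop_redundant(
    columns: Dict[str, Sequence[float]],
    ranked_names: List[str],
    max_abs_corr: float = 0.8,  # hypothetical cutoff, not from the source
) -> List[str]:
    """Walk attributes in ranked order, keeping each one only if it is not
    strongly correlated with an attribute already kept. A NaN correlation
    fails the comparison, so a constant column is dropped unless it is first."""
    kept: List[str] = []
    for name in ranked_names:
        if all(abs(pearson(columns[name], columns[k])) < max_abs_corr for k in kept):
            kept.append(name)
    return kept
```

`ranked_names` would typically be ordered by univariate association with the outcome, so that the strongest attribute in each correlated group is the one that survives.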
