Using "big data" to capture overall health status: properties and predictive value of a claims-based health risk score

PLoS One. 2015 May 7;10(5):e0126054. doi: 10.1371/journal.pone.0126054. eCollection 2015.

Abstract

Background: Investigators across many fields often struggle with how best to capture an individual's overall health status, with options including both subjective and objective measures. With the increasing availability of "big data," researchers can now take advantage of novel metrics of health status. These predictive algorithms were initially developed to forecast and manage expenditures, yet they represent an underutilized tool that could contribute significantly to health research. In this paper, we describe the properties and possible applications of one such "health risk score," the DxCG Intelligence tool.

Methods: We link claims and administrative datasets on a cohort of U.S. workers during the period 1996-2011 (N = 14,161). We examine the risk score's association with incident diagnoses of five disease conditions, and we link employee data with the National Death Index to characterize its relationship with mortality. We review prior studies documenting the risk score's association with other health and non-health outcomes, including healthcare utilization, early retirement, and occupational injury.

Results and conclusions: We find that the risk score is associated with outcomes across a variety of health and non-health domains. These examples demonstrate the broad applicability of this tool in multiple fields of research and illustrate its utility as a measure of overall health status for epidemiologists and other health researchers.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cohort Studies
  • Health Status*
  • Humans
  • Insurance Claim Review*
  • Risk