Detecting clinically meaningful biomarkers with repeated measurements: An illustration with electronic health records

Biometrics. 2015 Jun;71(2):478-86. doi: 10.1111/biom.12283. Epub 2015 Feb 4.


Data sources with repeated measurements are an appealing resource for understanding the relationship between changes in biological markers and the risk of a clinical event. While longitudinal data offer the opportunity to observe changing risk over time, analysis is complicated when clinical measurements are sparse and/or irregular, which makes typical statistical methods unsuitable. In this article, we use electronic health record (EHR) data as an example to present a procedure for both constructing an analytic sample and analyzing the data to detect clinically meaningful markers of acute myocardial infarction (MI). Using an EHR from a large national dialysis organization, we abstracted the records of 64,318 individuals and identified 4,769 people who had an MI during the study period. We describe a nested case-control design to sample appropriate controls and an analytic approach using regression splines. Fitting a mixed model with truncated power splines, we perform a series of goodness-of-fit tests to determine whether any of 11 regularly collected laboratory markers are useful clinical predictors. We test the clinical utility of each marker using an independent test set. The results suggest that EHR data can be easily used to detect markers of clinically acute events. Special software or analytic tools are not needed, even with irregular EHR data.
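The two building blocks named in the abstract, a nested case-control sample and a truncated power spline basis, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the choice of knots, the spline degree, and the rule for who counts as "at risk" (anyone without an event before the case's event time, ignoring censoring) are all assumptions made for illustration. The basis rows would serve as design-matrix columns in whatever mixed-model routine is used.

```python
import random


def truncated_power_basis(t, knots, degree=1):
    """Return one design-matrix row for measurement time t.

    Columns are 1, t, ..., t^degree, followed by (t - k)_+^degree for
    each knot k, where (x)_+ = max(x, 0). Knot locations are chosen by
    the analyst (hypothetical here).
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    row = [float(t) ** p for p in range(degree + 1)]
    row += [max(float(t) - k, 0.0) ** degree for k in knots]
    return row


def sample_risk_set_controls(event_times, case_id, n_controls, rng):
    """Sample controls for one case from subjects still event-free.

    event_times maps subject id -> event time, or None when no event
    was observed. Assumption: censoring is ignored, so every subject
    without an earlier event is treated as at risk.
    """
    t_case = event_times[case_id]
    at_risk = [
        subject
        for subject, t in event_times.items()
        if subject != case_id and (t is None or t > t_case)
    ]
    return rng.sample(at_risk, min(n_controls, len(at_risk)))


if __name__ == "__main__":
    # Hypothetical cohort: subject id -> day of MI (None = no MI observed).
    cohort = {"a": 10, "b": 5, "c": None, "d": 12}
    controls = sample_risk_set_controls(cohort, "a", 2, random.Random(0))
    print(sorted(controls))
    # Linear truncated power basis with hypothetical knots at days 2 and 7.
    print(truncated_power_basis(5.0, [2.0, 7.0], degree=1))
```

Because the basis is evaluated at each observed measurement time, irregular or sparse sampling needs no special handling: each lab value simply contributes one row.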

Keywords: Biological markers; Dialysis; Longitudinal data; Myocardial infarction; Risk prediction; Splines.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biomarkers / analysis*
  • Biometry
  • Case-Control Studies
  • Electronic Health Records / statistics & numerical data*
  • Humans
  • Models, Statistical
  • Myocardial Infarction / diagnosis
  • Myocardial Infarction / metabolism
  • Predictive Value of Tests
  • Regression Analysis
  • Renal Dialysis


Substances

  • Biomarkers