Parameterizing Time in Electronic Health Record Studies

J Am Med Inform Assoc. 2015 Jul;22(4):794-804. doi: 10.1093/jamia/ocu051. Epub 2015 Feb 26.


Background: Fields like nonlinear physics offer methods for analyzing time series, but many of these methods require that the time series be stationary, that is, that its properties not change over time.

Objective: Medicine is far from stationary, but the challenge may be ameliorated by reparameterizing time, because clinicians tend to measure patients more frequently when they are ill and their measurements are more likely to vary.

Methods: We compared time parameterizations, measuring variability of rate of change and magnitude of change, and looking for homogeneity of bins of temporal separation between pairs of time points. We studied four common laboratory tests drawn from 25 years of electronic health records on 4 million patients.
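The pairwise quantities named above (temporal separation, magnitude of change, and rate of change between pairs of time points) can be illustrated with a minimal Python sketch. The abstract does not give the study's exact statistics or binning scheme, so the function name `pairwise_changes` and the choice of absolute difference as the magnitude of change are assumptions made only for illustration.

```python
from itertools import combinations

def pairwise_changes(times, values):
    """For every pair of measurements (i < j), return a tuple of
    (temporal separation, magnitude of change, rate of change).

    `times` holds numeric time coordinates under whichever parameterization
    is being examined; `values` holds the matching laboratory results.
    Illustrative only: the study's actual statistics may differ.
    """
    results = []
    for i, j in combinations(range(len(times)), 2):
        separation = times[j] - times[i]
        if separation <= 0:
            # Skip pairs with no positive separation (ties or unsorted input).
            continue
        magnitude = abs(values[j] - values[i])
        results.append((separation, magnitude, magnitude / separation))
    return results

# Hypothetical example: three measurements at times 0, 1, and 3.
pairs = pairwise_changes([0, 1, 3], [1.0, 2.0, 1.0])
```

Grouping these tuples by separation and comparing the spread of magnitude or rate within each group is one way to see how homogeneous the bins are under a given parameterization.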

Results: We found that sequence time (that is, simply counting the number of measurements from some start) produced more stationary time series, better explained the variation in values, and had more homogeneous bins than either traditional clock time or a recently proposed intermediate parameterization. Sequence time also produced more accurate predictions in a single Gaussian process model experiment.
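To make the distinction between clock time and sequence time concrete, here is a minimal Python sketch. The function name `parameterize`, the choice of days as the clock unit, and the example dates are assumptions for illustration and do not come from the study. The intermediate parameterization is not defined in this abstract and is therefore omitted.

```python
from datetime import datetime

def parameterize(timestamps):
    """Return (clock_time_days, sequence_time) for one patient's measurements.

    Clock time is elapsed days since the first measurement.
    Sequence time is the count of measurements since the first one.
    """
    ordered = sorted(timestamps)
    if not ordered:
        return [], []
    start = ordered[0]
    clock = [(t - start).total_seconds() / 86400.0 for t in ordered]
    sequence = list(range(len(ordered)))
    return clock, sequence

# Hypothetical patient: three daily measurements while ill,
# then one follow-up measurement about six months later.
times = [
    datetime(2010, 1, 1),
    datetime(2010, 1, 2),
    datetime(2010, 1, 3),
    datetime(2010, 7, 1),
]
clock, seq = parameterize(times)
```

In this example the long gap before the follow-up dominates the clock-time axis, while sequence time spaces all four measurements evenly.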

Conclusions: Of the three parameterizations, sequence time appeared to produce the most stationary series, possibly because clinicians adjust their sampling to the acuity of the patient. Parameterizing by sequence time may be applicable to association and clustering experiments on electronic health record data. A limitation of this study is that laboratory data were derived from only one institution. Sequence time appears to be an important potential parameterization.

Keywords: data mining; electronic health record; parameterization; phenotype; time; time series.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Data Mining
  • Electronic Health Records*
  • Humans
  • Models, Theoretical
  • Time*