At-risk-measure Sampling in Case-Control Studies with Aggregated Data

Epidemiology. 2021 Jan;32(1):101-110. doi: 10.1097/EDE.0000000000001268.


Transient exposures are difficult to measure in epidemiologic studies, especially when both the status of being at risk for an outcome and the exposure change over time and space, as when measuring the effect of the built environment on transportation injury risk. Contemporary "big data" generated by mobile sensors can improve measurement of transient exposures. Exposure information generated by these devices typically only samples the experience of the target cohort, so a case-control framework may be useful. However, for anonymity, the data may not be available at the individual level, precluding a case-crossover approach. We present a method called at-risk-measure sampling. Its goal is to estimate the denominator of an incidence rate ratio (exposed to unexposed measure of the at-risk experience) given an aggregated summary of the at-risk measure from a cohort. Rather than sampling individuals or locations, the method samples the measure of the at-risk experience. Specifically, the method as presented samples person-distance and person-events summarized by location. It is illustrated with data from a mobile app used to record bicycling. The method extends an established case-control sampling principle: sample the at-risk experience of a cohort study such that the sampled exposure distribution approximates that of the cohort. It is distinct from density sampling in that the sample remains in the form of the at-risk measure, which may be continuous, such as person-time or person-distance. This aspect may be both logistically and statistically efficient if such a sample is already available, for example from big-data sources like aggregated mobile-sensor data.
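The arithmetic behind the abstract's central idea can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the `Segment` type, the `estimate_rate_ratio` function, and all numbers are hypothetical. It assumes the sampled person-distance is already aggregated by location, that each location is classified as exposed or unexposed, and that cases are counted separately by exposure. The ratio of exposed to unexposed sampled person-distance then stands in for the denominator of the incidence rate ratio.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One location's aggregated at-risk measure (hypothetical structure)."""
    exposed: bool      # whether the location has the exposure of interest
    person_km: float   # sampled person-distance summarized at this location


def estimate_rate_ratio(cases_exposed, cases_unexposed, segments):
    """Estimate an incidence rate ratio from case counts and a sampled
    at-risk measure that is aggregated by location.

    Numerator: ratio of exposed to unexposed cases.
    Denominator: ratio of exposed to unexposed sampled person-distance.
    """
    pd_exposed = sum(s.person_km for s in segments if s.exposed)
    pd_unexposed = sum(s.person_km for s in segments if not s.exposed)
    if cases_unexposed <= 0 or pd_exposed <= 0 or pd_unexposed <= 0:
        raise ValueError("need unexposed cases and person-distance in both groups")
    case_ratio = cases_exposed / cases_unexposed
    at_risk_ratio = pd_exposed / pd_unexposed
    return case_ratio / at_risk_ratio


# Made-up numbers for illustration; they are not from the study.
segments = [
    Segment(exposed=True, person_km=120.0),
    Segment(exposed=True, person_km=80.0),
    Segment(exposed=False, person_km=300.0),
    Segment(exposed=False, person_km=500.0),
]

print(estimate_rate_ratio(30, 60, segments))  # (30/60) / (200/800) = 2.0
```

In this sketch the sample stays in the form of the at-risk measure (person-kilometers) rather than being converted to a count of control individuals, which is the distinction from density sampling that the abstract draws. Variance estimation and any weighting of the aggregated sample are left out.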

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Case-Control Studies
  • Cohort Studies*
  • Humans
  • Incidence