Organizing and analyzing the activity data in NHANES

Stat Biosci. 2019 Jul;11(2):262-287. doi: 10.1007/s12561-018-09229-9. Epub 2019 Feb 9.


The NHANES study contains objectively measured physical activity data collected using hip-worn accelerometers from multiple cohorts. However, using the accelerometry data has proven daunting because: 1) currently, there are no agreed-upon standard protocols for data storage and analysis; 2) data exhibit heterogeneous patterns of missingness due to varying degrees of adherence to wear-time protocols; 3) sampling weights need to be carefully adjusted and accounted for in individual analyses; 4) there is a lack of reproducible software that transforms the data from their published format into analytic form; and 5) the high-dimensional nature of accelerometry data complicates analyses. Here, we provide a framework for processing, storing, and analyzing the NHANES accelerometry data for the 2003-2004 and 2005-2006 surveys. We also provide an NHANES data package in R to help disseminate high-quality, processed activity data combined with mortality and demographic information. Thus, we provide the tools to transition from "available data online" to "easily accessible and usable data", which substantially reduces the large upfront costs of initiating studies of association between physical activity and human health outcomes using NHANES. We apply these tools in an analysis showing that accelerometry features have the potential to predict 5-year all-cause mortality better than known risk factors such as age, cigarette smoking, and various comorbidities.

Keywords: Accelerometry; NHANES; Physical Activity; Prediction.