Multiple imputation of missing data with skip-pattern covariates: a comparison of alternative strategies

J Stat Comput Simul. 2023;94(7):1543-1570. doi: 10.1080/00949655.2023.2293124.

Abstract

Multiple imputation (MI) is a widely used approach to handling missing data in surveys. Variables included in MI can have various distributional forms and different degrees of missingness. However, when variables with missing data are subject to skip patterns (i.e. questions that do not apply to some survey participants are skipped), implementing MI may not be straightforward. In this research, we compare two approaches to MI when skip-pattern covariates have missing values. The first approach imputes missing values in the skip-pattern variables only among applicable subjects, denoted imputation among applicable cases (IAAC). The second approach imputes skip-pattern covariates among all subjects while using different recoding methods on the skip-pattern variables, denoted imputation with recoded non-applicable cases (IWRNC). A simulation study is conducted to compare these methods. Both approaches are applied to the 2015 and 2016 Research and Development Survey data from the National Center for Health Statistics.

Keywords: Multiple imputation; RANDS survey; missing skip-pattern variables.
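
The difference between the two approaches named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy variables `smoker` (the gate question) and `cigs_per_day` (the skip-pattern follow-up), the `SKIP` marker, the recode value of 0, and the use of a simple random hot-deck draw in place of a proper MI model. Only the Python standard library is used.

```python
import random

NA = None        # applicable respondent, answer missing
SKIP = "skip"    # hypothetical marker: question legitimately skipped


def make_toy_data(n=200, seed=0):
    """Toy survey in which 'smoker' gates the follow-up 'cigs_per_day'."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smoker = rng.random() < 0.4
        if not smoker:
            cigs = SKIP                  # not applicable, skipped by design
        elif rng.random() < 0.25:
            cigs = NA                    # applicable, but value is missing
        else:
            cigs = rng.randint(1, 30)    # applicable and observed
        rows.append({"smoker": smoker, "cigs_per_day": cigs})
    return rows


def iaac(rows, rng):
    """IAAC-style: impute only among applicable subjects; skips stay skipped."""
    donors = [r["cigs_per_day"] for r in rows
              if r["smoker"] and r["cigs_per_day"] is not NA]
    return [
        {**r, "cigs_per_day": rng.choice(donors)}
        if r["smoker"] and r["cigs_per_day"] is NA else dict(r)
        for r in rows
    ]


def iwrnc(rows, rng, recode_value=0):
    """IWRNC-style: recode skipped cases (assumed value: 0), then impute among all."""
    recoded = [
        {**r, "cigs_per_day": recode_value} if r["cigs_per_day"] == SKIP else dict(r)
        for r in rows
    ]
    # Donors are pooled over all subjects, including the recoded ones.
    donors = [r["cigs_per_day"] for r in recoded if r["cigs_per_day"] is not NA]
    return [
        {**r, "cigs_per_day": rng.choice(donors)} if r["cigs_per_day"] is NA else r
        for r in recoded
    ]


def impute_m_times(rows, method, m=5, seed=0):
    """Produce m completed data sets, one independent random stream per data set."""
    return [method(rows, random.Random(seed + k)) for k in range(m)]


if __name__ == "__main__":
    data = make_toy_data()
    for name, method in [("IAAC", iaac), ("IWRNC", iwrnc)]:
        completed = impute_m_times(data, method)
        n_numeric = sum(isinstance(r["cigs_per_day"], int) for r in completed[0])
        print(name, "numeric values in first completed data set:", n_numeric)
```

In this sketch the IAAC-style function leaves non-applicable cases marked as skipped, so only applicable subjects carry a numeric value after imputation, whereas the IWRNC-style function yields a numeric value for every subject. The hot-deck draw ignores all other covariates; a real analysis would use a proper imputation model and combine the completed data sets with the usual MI rules.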