Missing data assumptions and methods in a smoking cessation study

Addiction. 2010 Mar;105(3):431-7. doi: 10.1111/j.1360-0443.2009.02809.x.


Aim: A sizable percentage of subjects do not respond to follow-up attempts in smoking cessation studies. The usual procedure in the smoking cessation literature is to assume that non-respondents have resumed smoking. This study used a data set with a high follow-up rate to assess the degree of bias that may be caused by different methods of imputing missing data.

Design and methods: Using a large data set with very little missing follow-up information at 12 months, a simulation study was undertaken to compare missing data imputation methods (assuming smoking, propensity score matching and optimal matching) under various assumptions as to how the missing data arose (randomly generated missing values, increased non-response from smokers and a hybrid of the two).
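The simulation design described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `simulate`, the sample size, the quit rate and the non-response probabilities are all hypothetical values chosen for illustration, and only the "assume smoking" rule is shown (propensity score matching and optimal matching are omitted). The three calls at the end stand in for the three missingness mechanisms: purely random, smoker-weighted, and a mix of the two.

```python
import random


def simulate(n, quit_rate, miss_quitter, miss_smoker, seed=0):
    """Return (true_quit_rate, assume_smoking_estimate) for one simulated cohort.

    All parameters are illustrative assumptions, not values from the study:
      quit_rate     probability that a subject is abstinent at follow-up
      miss_quitter  probability that an abstinent subject does not respond
      miss_smoker   probability that a smoking subject does not respond
    """
    rng = random.Random(seed)
    # Complete ("true") outcomes, as if every subject had been followed up.
    quitters = [rng.random() < quit_rate for _ in range(n)]
    # Delete follow-up data with a probability that may depend on the outcome.
    responded = [
        rng.random() >= (miss_quitter if q else miss_smoker) for q in quitters
    ]
    true_rate = sum(quitters) / n
    # "Assume smoking": every non-respondent is counted as a smoker,
    # so only observed quitters contribute to the numerator.
    estimate = sum(q and r for q, r in zip(quitters, responded)) / n
    return true_rate, estimate


# Hypothetical mechanisms: (label, miss_quitter, miss_smoker).
mechanisms = [
    ("random", 0.20, 0.20),
    ("smoker-weighted", 0.05, 0.35),
    ("hybrid", 0.12, 0.28),
]
for label, mq, ms in mechanisms:
    true_rate, estimate = simulate(20000, 0.30, mq, ms)
    print(f"{label:16s} true={true_rate:.3f} "
          f"assume-smoking={estimate:.3f} bias={estimate - true_rate:+.3f}")
```

Because the complete outcomes are known before any values are deleted, the bias of an imputation rule can be read off directly as the difference between the estimate and the true rate, which is the comparison the design relies on.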

Findings: All of the missing data imputation methods resulted in some degree of bias, which increased with the amount of missing data.
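For the "assume smoking" rule alone, the dependence of bias on the amount of missing data can be worked out directly. The derivation below is an illustration under simplifying assumptions, not a result reported in the study: $p$ denotes the true abstinence rate, $m_q$ the probability that an abstinent subject fails to respond, and $n$ the number of subjects; these symbols are introduced here for the purpose.

```latex
% Assume-smoking estimate: observed quitters divided by all subjects.
\hat{p} = \frac{\#\{\text{abstinent and responding}\}}{n},
\qquad
\mathbb{E}[\hat{p}] = p\,(1 - m_q),
\qquad
\operatorname{Bias}(\hat{p}) = \mathbb{E}[\hat{p}] - p = -\,p\,m_q .
```

Under this rule the estimate is biased downward whenever any abstinent subjects are lost to follow-up, and the bias grows in proportion to their non-response rate. When data are missing completely at random, $m_q$ equals the overall missing fraction, so the bias scales linearly with the amount of missing data.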

Conclusion: None of the missing data imputation methods currently available can compensate for bias when there are substantial amounts of missing data.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bias
  • Clinical Trials as Topic / standards*
  • Data Interpretation, Statistical
  • Follow-Up Studies
  • Humans
  • Smoking Cessation / statistics & numerical data*