A general method for dealing with misclassification in regression: the misclassification SIMEX

Biometrics. 2006 Mar;62(1):85-96. doi: 10.1111/j.1541-0420.2005.00396.x.


We have developed a new general approach for handling misclassification in discrete covariates or responses in regression models. The simulation and extrapolation (SIMEX) method, which was originally designed for handling additive covariate measurement error, is applied to the case of misclassification. The statistical model for characterizing misclassification is given by the transition matrix Pi from the true to the observed variable. We exploit the relationship between the amount of misclassification and the bias in estimating the parameters of interest. Assuming that Pi is known or can be estimated from validation data, we simulate data with higher misclassification and extrapolate back to the case of no misclassification. We show that our method is quite general and applicable to models with a misclassified response and/or misclassified discrete regressors. For a misclassified binary response, we compare our method with the approach of Neuhaus; for a misclassified binary regressor, we compare it with the matrix method of Morrissey and Spiegelman. We apply our method to a study on caries with a misclassified longitudinal response.
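
The simulate-and-extrapolate idea described above can be illustrated with a minimal Python sketch for one misclassified binary regressor in a linear model. This is not the authors' implementation. It assumes that column j of Pi holds P(observed = i | true = j), that additional misclassification at level lambda is generated with the matrix power Pi^lambda (so the simulated data carry total misclassification Pi^(1 + lambda)), and that a quadratic curve is an adequate extrapolant back to lambda = -1. All function names, the lambda grid, the example matrix, and the simulated data are hypothetical.

```python
import numpy as np


def fractional_power(pi, lam):
    """Return pi**lam via the eigendecomposition pi = E diag(v) E^-1."""
    eigvals, eigvecs = np.linalg.eig(pi)
    powered = eigvecs @ np.diag(eigvals.astype(complex) ** lam) @ np.linalg.inv(eigvecs)
    result = np.real(powered)
    if result.min() < -1e-10:
        raise ValueError("pi**lam is not a valid transition matrix for this lam")
    result = np.clip(result, 0.0, None)
    return result / result.sum(axis=0, keepdims=True)  # columns sum to 1


def misclassify(x, trans, rng):
    """Draw a new category for each entry of x; column x[i] of trans gives its distribution."""
    cum = np.cumsum(trans, axis=0)              # column j is the CDF given current value j
    u = rng.random(x.shape[0])
    new = (u[:, None] > cum[:, x].T).sum(axis=1)
    return np.minimum(new, trans.shape[0] - 1)  # guard against rounding in the last CDF entry


def ols_slope(y, x):
    """Naive estimator: least-squares slope of y on x, ignoring misclassification."""
    return np.polyfit(x, y, 1)[0]


def mcsimex_slope(y, x_obs, pi, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=50, seed=0):
    """Simulation step on a lambda grid, then quadratic extrapolation to lambda = -1."""
    rng = np.random.default_rng(seed)
    grid = [0.0]
    means = [ols_slope(y, x_obs)]               # lambda = 0 is the naive estimate
    for lam in lambdas:
        trans = fractional_power(pi, lam)
        sims = [ols_slope(y, misclassify(x_obs, trans, rng)) for _ in range(n_rep)]
        grid.append(lam)
        means.append(np.mean(sims))
    coeffs = np.polyfit(grid, means, 2)         # assumed quadratic extrapolant
    return np.polyval(coeffs, -1.0)


def run_example(n=20000, true_slope=2.0, seed=1):
    """Simulated illustration; the data and the matrix below are made up."""
    rng = np.random.default_rng(seed)
    pi = np.array([[0.8, 0.1],
                   [0.2, 0.9]])                 # hypothetical: specificity 0.8, sensitivity 0.9
    x_true = rng.integers(0, 2, size=n)
    y = 1.0 + true_slope * x_true + rng.normal(size=n)
    x_obs = misclassify(x_true, pi, rng)
    return ols_slope(y, x_obs), mcsimex_slope(y, x_obs, pi)


if __name__ == "__main__":
    naive, corrected = run_example()
    print("naive slope:", naive)
    print("MC-SIMEX slope:", corrected)
```

In this setup the naive slope is attenuated toward zero, and the extrapolated value should lie closer to the slope used to generate the data. A quadratic extrapolant is only an approximation, so some residual bias is expected.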

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias*
  • Biometry
  • Humans
  • Linear Models*
  • Longitudinal Studies
  • Methods
  • Models, Theoretical