Correction for misclassification of a categorized exposure in binary regression using replication data

Stat Med. 2009 Nov 30;28(27):3386-410. doi: 10.1002/sim.3712.

Abstract

Continuous epidemiologic exposure data are often categorized according to one or more cut points before inclusion in a regression analysis involving some outcome variable. If the original data are subject to measurement error, the categorized data will be afflicted with misclassification, which is differential, and which induces biases in naïve methods that ignore the misclassification. We propose a method for measurement error adjustment in these settings, when there are replicate data available on the original measurements, and when the outcome variable is dichotomous. Working on the continuous measurements, we estimate the conditional densities of the exposure given the outcome and use them to obtain odds ratios. The densities are estimated either parametrically or nonparametrically. In simulation studies, the method is compared with the naïve approach of simply categorizing the erroneous mean measurements. Although the nonparametric method is more variable, it has the best overall performance, with the greatest differences observed in settings where the effects and/or the measurement errors are large. The performance of the parametric method is highly dependent on the model fit. Applied to a real-life data set from the Framingham Heart Study, the methods produced larger estimated odds ratios for coronary heart disease as a result of elevated systolic blood pressure than the naïve approach did. We also discuss alternative procedures that might be considered, including regression calibration, SIMEX, and the use of estimated misclassification probabilities.
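
To make the idea concrete, the following is a minimal simulation sketch of a parametric variant of the approach described in the abstract. It is not the authors' implementation. It assumes that the true exposure given the outcome is normally distributed, that the replicate measurements carry additive, nondifferential normal error on the continuous scale, and that a single cut point defines the exposed category. All function names, parameter values, and the cut point are hypothetical choices made for illustration.

```python
import math
import numpy as np


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def odds_ratio(p1, p0):
    """Odds ratio from exposure probabilities among cases (p1) and controls (p0)."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


def simulate(n=20000, k=2, mu=(0.0, 1.0), sd_x=1.0, sd_u=1.0, seed=0):
    """Simulate outcome y and k error-prone replicates w of a true exposure x.

    Assumption: x | y is normal with mean mu[y] and standard deviation sd_x;
    each replicate is x plus independent normal error with sd sd_u.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    x = rng.normal(np.where(y == 1, mu[1], mu[0]), sd_x)
    w = x[:, None] + rng.normal(0.0, sd_u, size=(n, k))
    return y, w


def naive_or(y, w, cut):
    """Naive approach: categorize the mean of the replicates at the cut point."""
    exposed = w.mean(axis=1) > cut
    return odds_ratio(exposed[y == 1].mean(), exposed[y == 0].mean())


def corrected_or(y, w, cut):
    """Parametric correction under the normality assumption stated above.

    The error variance is estimated from the within-subject variance of the
    replicates and subtracted from the variance of the replicate means to
    recover the variance of the true exposure in each outcome group.
    """
    k = w.shape[1]
    wbar = w.mean(axis=1)
    var_u = w.var(axis=1, ddof=1).mean()  # pooled error variance estimate
    probs = []
    for group in (1, 0):
        m = wbar[y == group]
        var_x = max(m.var(ddof=1) - var_u / k, 1e-12)
        probs.append(1.0 - normal_cdf((cut - m.mean()) / math.sqrt(var_x)))
    return odds_ratio(probs[0], probs[1])


cut = 1.0
y, w = simulate()
# True odds ratio implied by the simulation settings (mu = (0, 1), sd_x = 1).
true_or = odds_ratio(1.0 - normal_cdf(cut - 1.0), 1.0 - normal_cdf(cut - 0.0))
naive = naive_or(y, w, cut)
corrected = corrected_or(y, w, cut)
print(f"true OR: {true_or:.2f}, naive OR: {naive:.2f}, corrected OR: {corrected:.2f}")
```

In this setting the naive odds ratio is attenuated toward one, while the corrected estimate should lie closer to the true value. Because the sketch relies on the normal model being correct, it also illustrates why the abstract notes that the parametric method depends heavily on model fit; a nonparametric density estimate would replace the normal tail probabilities.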

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Blood Pressure / physiology
  • Computer Simulation*
  • Coronary Disease / physiopathology
  • Humans
  • Male
  • Middle Aged
  • Models, Statistical*
  • Odds Ratio*
  • Regression Analysis*