Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates.
Keywords: categorical variable; continuous variable; exposure modelling; linearity; regression.
© The Author 2013. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.