Purpose: To review the principles of multivariable analysis and to examine the application of multivariable statistical methods in general medical literature.
Data sources: A computer-assisted search of articles in The Lancet and The New England Journal of Medicine identified 451 publications from 1985 through 1989 that used multivariable methods. A random sample of 60 articles that used either of the two most common methods (logistic regression or proportional hazards analysis) was selected for more intensive review.
Data extraction: The review of the 60 randomly selected articles focused on generally accepted methodologic guidelines that can prevent problems affecting the accuracy and interpretation of multivariable analytic results.
Results: From 1985 to 1989, the proportion of all articles in the two journals that used multivariable statistical methods rose each year, from about 10% to 18%. In 44 (73%) of the 60 articles using logistic or proportional hazards regression, risk estimates were quantified for individual variables ("risk factors"). Violations and omissions of methodologic guidelines in these 44 articles included overfitting of data; no test of conformity of variables to a linear gradient; no mention of pertinent checks for proportional hazards; no report of testing for interactions between independent variables; and unspecified coding or selection of independent variables. These problems would make the reported results potentially inaccurate, misleading, or difficult to interpret.
Conclusions: The findings suggest a need for improvement in the reporting, and perhaps the conduct, of multivariable analyses in medical research.
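Illustration: Of the problems listed under Results, overfitting lends itself most directly to a quick numerical check. The following is a minimal sketch in Python, not the method used in this review. It assumes a threshold of 10 outcome events per candidate variable, a commonly cited rule of thumb that the abstract itself does not state. The function names and the counts in the example are hypothetical and are not drawn from the reviewed articles.

```python
def events_per_variable(n_outcome_events, n_candidate_variables):
    """Return the ratio of outcome events to candidate independent variables.

    n_outcome_events: number of outcome events (for logistic regression,
        the count in the less frequent outcome category).
    n_candidate_variables: number of independent variables considered
        for the model.
    """
    if n_candidate_variables <= 0:
        raise ValueError("n_candidate_variables must be positive")
    if n_outcome_events < 0:
        raise ValueError("n_outcome_events must be non-negative")
    return n_outcome_events / n_candidate_variables


def may_be_overfit(n_outcome_events, n_candidate_variables, threshold=10.0):
    """Flag a model whose events-per-variable ratio is below the threshold.

    The default threshold of 10 is an assumed rule of thumb, not a value
    taken from the abstract.
    """
    ratio = events_per_variable(n_outcome_events, n_candidate_variables)
    return ratio < threshold


if __name__ == "__main__":
    # Hypothetical study: 40 outcome events, 8 candidate variables.
    print(events_per_variable(40, 8))   # 5.0
    print(may_be_overfit(40, 8))        # True
    # Hypothetical study: 120 outcome events, 8 candidate variables.
    print(may_be_overfit(120, 8))       # False
```

A ratio below the chosen threshold does not prove that a model is overfit; it only signals that the risk estimates deserve closer scrutiny.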