Pisces did not have increased heart failure: data-driven comparisons of binary proportions between levels of a categorical variable can result in incorrect statistical significance levels

J Clin Epidemiol. 2008 Mar;61(3):295-300. doi: 10.1016/j.jclinepi.2007.05.007. Epub 2007 Sep 24.

Abstract

Objective: We examined how statistical inference is affected when a chi-square (χ²) test is used to compare the proportion of successes in the level of a categorical variable with the highest observed proportion against the proportion of successes in all other levels combined.

Study design and setting: Monte Carlo simulations and a case study examining the association between astrological sign and hospitalization for heart failure.

Results: A standard χ² test inflates the type I error rate, and the inflation grows as the number of levels of the categorical variable increases. Using a standard χ² test, the hospitalization rate for Pisces was statistically significantly different from that of the other 11 astrological signs combined (P=0.026). After accounting for the fact that Pisces was selected because it had the highest observed proportion of heart failure hospitalizations, subjects born under the sign of Pisces no longer had a significantly higher rate of heart failure hospitalization than the other residents of Ontario (P=0.152).
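
The inflation described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code. It assumes equal group sizes, a common true success probability in every level (so the null hypothesis holds), and a Pearson χ² test on 1 degree of freedom without continuity correction. The function names and the parameters `n_per_level`, `p_success`, and `reps` are hypothetical choices made for illustration only.

```python
import math
import random


def chi2_p_value_2x2(a, b, c, d):
    """P-value of the Pearson chi-square test (1 df, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence against the null
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def type_i_error_rate(num_levels, n_per_level=100, p_success=0.3,
                      reps=2000, alpha=0.05, seed=0):
    """Estimate how often the data-driven 'highest level vs. all others'
    comparison rejects at level alpha when every level shares the same
    true success probability (assumed setup, not the paper's design)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Simulate the number of successes in each level under the null.
        successes = [
            sum(rng.random() < p_success for _ in range(n_per_level))
            for _ in range(num_levels)
        ]
        # Groups are equal-sized, so the highest count is the highest proportion.
        top = max(successes)
        rest = sum(successes) - top
        n_rest = n_per_level * (num_levels - 1)
        p = chi2_p_value_2x2(top, n_per_level - top, rest, n_rest - rest)
        if p < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for k in (2, 6, 12):
        print(k, "levels: estimated type I error =", type_i_error_rate(k))
```

With two levels the two-sided test is symmetric, so choosing the higher level should leave the rejection rate close to the nominal 0.05. With more levels, the same procedure should reject more often, which is the pattern the Results describe. An adjusted P value could be obtained in a similar spirit by comparing the observed statistic with its simulated null distribution, although the abstract does not specify the exact method used.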

Conclusions: Post hoc comparisons of the proportions of successes across different levels of a categorical variable can result in incorrect inferences.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Astrology*
  • Confounding Factors, Epidemiologic
  • Epidemiologic Methods
  • Heart Failure / epidemiology*
  • Heart Failure / etiology
  • Hospitalization / statistics & numerical data
  • Humans
  • Ontario / epidemiology