Basic Introduction to Statistics in Medicine, Part 2: Comparing Data

Wyatt P Bensken; Vanessa P Ho; Fredric M Pieracci

doi:10.1089/sur.2020.430

Surg Infect (Larchmt). 2021 Aug;22(6):597-603. doi: 10.1089/sur.2020.430.

Authors

Wyatt P Bensken¹, Vanessa P Ho¹ ², Fredric M Pieracci³

Affiliations

¹ Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA.
² Department of Surgery, MetroHealth Medical Center, Cleveland, Ohio, USA.
³ Department of Surgery, Denver Health Medical Center, Denver, Colorado, USA.

Abstract

Background: Comparison of parameters between two or more groups forms the basis of hypothesis testing. Statistical tests, and the statistical significance they yield, report how likely the observed results would be by chance alone if the null hypothesis were true.

Methods: To demonstrate the concepts described, we used the Nationwide Inpatient Sample for patients admitted for emergency general surgery (EGS) and those admitted with non-EGS diagnoses. Depending on the type and distribution of individual variables, appropriate statistical tests were applied.

Results: Comparison of numerical variables between two groups begins with a simple correlation, depicted graphically in a scatterplot and assessed statistically with either a Pearson or Spearman correlation coefficient. Normality of numerical variables is then assessed. If the data are normally distributed, a t-test is applied when comparing two groups and an analysis of variance (ANOVA) when comparing three or more groups. For data that are not distributed normally, a Wilcoxon rank sum (Mann-Whitney U) test may be used. For categorical variables, the χ² test is used, unless cell counts are less than five, in which case the Fisher exact test is used. Importantly, both the ANOVA and the χ² test assess overall differences between two or more groups. When there are more than two groups, pairwise comparison tests, with adjustment for multiple comparisons, must be used to identify differences between two specific groups.

Conclusion: A basic understanding of statistical significance and of the type and distribution of variables is necessary to select the appropriate statistical test to compare data. Failure to understand these concepts may result in spurious conclusions.
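Illustrative sketch (not part of the published abstract): the test-selection rules summarized in the Results can be written as a small decision helper. The function names, argument names, and returned labels below are hypothetical and chosen only for illustration. Two details are assumptions that the abstract does not state: the Kruskal-Wallis test is offered as the usual rank-based choice for three or more non-normal groups, and Bonferroni is used as one simple way of adjusting for multiple comparisons.

```python
def choose_test(variable_type, n_groups, normal=None, min_cell_count=None):
    """Return the name of the test suggested by the rules in the abstract.

    variable_type  : "numerical" or "categorical"
    n_groups       : number of groups being compared (2 or more)
    normal         : True/False, whether the numerical data look normal
    min_cell_count : smallest cell count in the contingency table
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")

    if variable_type == "categorical":
        if min_cell_count is None:
            raise ValueError("min_cell_count is required for categorical data")
        # Rule from the abstract: chi-squared unless a cell count is below five.
        if min_cell_count < 5:
            return "Fisher exact test"
        return "chi-squared test"

    if variable_type == "numerical":
        if normal is None:
            raise ValueError("normal is required for numerical data")
        if normal:
            # Rule from the abstract: t-test for two groups, ANOVA for three or more.
            return "t-test" if n_groups == 2 else "ANOVA"
        if n_groups == 2:
            return "Wilcoxon rank sum (Mann-Whitney U) test"
        # Assumption: not named in the abstract; the usual rank-based
        # counterpart of ANOVA for three or more groups.
        return "Kruskal-Wallis test"

    raise ValueError("variable_type must be 'numerical' or 'categorical'")


def needs_pairwise_followup(n_groups):
    """Overall tests (ANOVA, chi-squared) do not say which groups differ,
    so pairwise follow-up is needed when there are more than two groups."""
    return n_groups > 2


def bonferroni_alpha(alpha, n_groups):
    """Per-comparison significance threshold when every pair of groups is
    compared. Assumption: Bonferroni is one of several possible adjustments."""
    n_pairs = n_groups * (n_groups - 1) // 2
    return alpha / n_pairs
```

For example, `choose_test("numerical", 3, normal=True)` returns `"ANOVA"`, and `needs_pairwise_followup(3)` returns `True`, signalling that pairwise comparisons at an adjusted threshold such as `bonferroni_alpha(0.05, 3)` would follow the overall test.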

Keywords: comparing data; statistical tests; statistics.

MeSH terms

Analysis of Variance*
Humans
Statistics, Nonparametric*

Grants and funding

KL2 TR002547/TR/NCATS NIH HHS/United States