Statistical interactions and Bayes estimation of log odds in case-control studies

Stat Methods Med Res. 2017 Apr;26(2):1021-1038. doi: 10.1177/0962280214567140. Epub 2015 Jan 12.

Abstract

This paper is concerned with the estimation of the logarithm of disease odds (log odds) when evaluating two risk factors, whether or not interactions are present. Statisticians define interaction as a departure from an additive model on a certain scale of measurement of the outcome. Certain interactions, known as removable interactions, may be eliminated by fitting an additive model under an invertible transformation of the outcome. This can potentially provide more precise estimates of log odds than fitting a model with interaction terms. In practice, we may also encounter nonremovable interactions. The model must then include interaction terms, regardless of the choice of the scale of the outcome. However, in practical settings, we do not know at the outset whether an interaction exists, and if so whether it is removable or nonremovable. Rather than trying to decide on significance levels to test for the existence of removable and nonremovable interactions, we develop a Bayes estimator based on a squared error loss function. We demonstrate the favorable bias-variance trade-offs of our approach using simulations, and provide empirical illustrations using data from three published endometrial cancer case-control studies. The methods are implemented in an R program that is freely available at http://www.mskcc.org/biostatistics/~satagopj.
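The idea of a removable interaction can be illustrated with a small numerical sketch. It is not the authors' R program or their Bayes estimator. The baseline risk, the two relative risks, and the helper name `interaction_contrast` are hypothetical and chosen only for illustration. The sketch builds disease risks for two binary risk factors that combine multiplicatively, then measures the departure from additivity on the logit (log odds) scale and on the log scale.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def interaction_contrast(cells, transform):
    """Departure from additivity on the scale `transform`:
    g(p11) - g(p10) - g(p01) + g(p00). Zero means no interaction on that scale."""
    g = {key: transform(value) for key, value in cells.items()}
    return g[(1, 1)] - g[(1, 0)] - g[(0, 1)] + g[(0, 0)]


# Hypothetical values (assumptions, not taken from the paper):
# baseline risk 0.05, relative risks 2 and 3, combining multiplicatively.
baseline, rr1, rr2 = 0.05, 2.0, 3.0
risk = {
    (x1, x2): baseline * rr1 ** x1 * rr2 ** x2
    for x1 in (0, 1)
    for x2 in (0, 1)
}

on_logit = interaction_contrast(risk, logit)    # nonzero: interaction on the log odds scale
on_log = interaction_contrast(risk, math.log)   # zero up to rounding: additive on the log scale

print(f"interaction contrast on logit scale: {on_logit:.4f}")
print(f"interaction contrast on log scale:   {on_log:.4f}")
```

With these assumed risks, a logistic model would need an interaction term, while an additive model fits exactly after a log transformation of the risk. In the abstract's terms, the interaction in this toy example is removable.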

Keywords: Bayes estimator; compositional epistasis; logistic link; mean squared error; minimax estimator; nonremovable interaction; removable interaction; transformation.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem*
  • Biostatistics / methods
  • Case-Control Studies
  • Computer Simulation
  • Data Interpretation, Statistical
  • Endometrial Neoplasms / etiology
  • Endometrial Neoplasms / genetics
  • Female
  • Humans
  • Linear Models
  • Logistic Models
  • Models, Statistical*
  • Risk Factors