Impact of Selection Bias on Estimation of Subsequent Event Risk

Circ Cardiovasc Genet. 2017 Oct;10(5):e001616. doi: 10.1161/CIRCGENETICS.116.001616.

Abstract

Background: Studies of recurrent or subsequent disease events may be susceptible to bias caused by selection of subjects who both experience and survive the primary indexing event. Currently, the magnitude of any selection bias, particularly for subsequent time-to-event analysis in genetic association studies, is unknown.

Methods and results: We used empirically inspired simulation studies to explore the impact of selection bias on the marginal hazard ratio for risk of subsequent events among those with established coronary heart disease. The extent of selection bias was determined by the magnitudes of genetic and nongenetic effects on the indexing (first) coronary heart disease event. Unless the genetic hazard ratio was unrealistically large (>1.6 per allele) and assuming the sum of all nongenetic hazard ratios was <10, bias was usually <10% (downward toward the null). Despite the low bias, the probability that a confidence interval included the true effect decreased (undercoverage) with increasing sample size because of increasing precision. Importantly, false-positive rates were not affected by selection bias.
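The selection mechanism described above can be illustrated with a small numerical sketch. This is not the authors' simulation design; it is a simplified illustration under stated assumptions: a single standard-normal nongenetic risk factor, an exponentially distributed time to the index event over a fixed follow-up window, identical true effects on the index and subsequent events (no effect modification), and no modelling of survival after the index event. The function name, parameter names, and default values are all hypothetical. It computes the per-allele marginal hazard ratio for a subsequent event at the start of follow-up among people selected for having had an index event.

```python
import numpy as np


def initial_marginal_hr(hr_gene=1.3, hr_nongenetic=3.0,
                        baseline_hazard=0.1, follow_up=1.0):
    """Per-allele marginal HR for a subsequent event at time zero,
    among people selected because they had an index event.

    All parameters are illustrative assumptions, not values from the paper.
    """
    # Grid over a hypothetical standard-normal nongenetic risk factor U.
    u = np.linspace(-8.0, 8.0, 16001)
    density = np.exp(-0.5 * u**2)  # unnormalised; constants cancel in ratios

    def mean_nongenetic_multiplier(alleles):
        # Probability of an index event by the end of follow-up,
        # assuming an exponential event time with a multiplicative hazard.
        hazard = baseline_hazard * hr_gene**alleles * hr_nongenetic**u
        p_index = -np.expm1(-hazard * follow_up)
        # Distribution of U among those selected (those with an index event).
        weights = density * p_index
        # Average nongenetic hazard multiplier in the selected group.
        return np.sum(weights * hr_nongenetic**u) / np.sum(weights)

    # Selection makes carriers slightly "healthier" on U than noncarriers,
    # which pulls the marginal genetic HR toward the null.
    return (hr_gene * mean_nongenetic_multiplier(1)
            / mean_nongenetic_multiplier(0))


if __name__ == "__main__":
    true_hr = 1.3
    observed = initial_marginal_hr(hr_gene=true_hr)
    relative_bias = (np.log(observed) - np.log(true_hr)) / np.log(true_hr)
    print(f"true HR {true_hr}, marginal HR among selected {observed:.4f}, "
          f"relative bias on log scale {relative_bias:.1%}")
```

In this sketch the bias arises only because the probability of an index event saturates for high-risk individuals; when the nongenetic hazard ratio is set to 1, the marginal hazard ratio equals the true one.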

Conclusions: In most empirical settings, selection bias is expected to have a limited impact on genetic effect estimates of subsequent event risk. Nevertheless, because undercoverage increases with sample size, most confidence intervals will be overly precise (too narrow). When there is no effect modification by history of coronary heart disease, the false-positive rates of association tests will be close to nominal.
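The link between a small fixed bias and undercoverage that grows with sample size can be made concrete with a short calculation. The sketch below assumes a normally distributed log hazard ratio estimate whose standard error shrinks as one over the square root of the sample size; the function name, the bias value, and the `sd_per_subject` scaling constant are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_coverage(bias, sample_size, sd_per_subject=2.0, level=0.95):
    """Probability that a nominal confidence interval contains the true
    log hazard ratio when the estimator carries a fixed bias.

    Assumes estimate ~ Normal(true + bias, se^2) with
    se = sd_per_subject / sqrt(sample_size) (an illustrative scaling).
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for 95%
    se = sd_per_subject / sqrt(sample_size)
    shift = bias / se  # bias expressed in standard-error units
    # The interval covers the truth when the standardised error
    # lies between -z - shift and z - shift.
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)


# A fixed downward bias of 0.02 on the log scale (hypothetical value).
for n in (1_000, 10_000, 100_000):
    print(n, round(ci_coverage(bias=-0.02, sample_size=n), 3))
```

The bias stays constant while the standard error shrinks, so the bias grows in standard-error units and coverage falls below the nominal level. Under a true null effect with no bias the same formula returns the nominal coverage, consistent with false-positive rates remaining close to nominal.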

Keywords: alleles; confidence intervals; genetic association studies; risk; sample size; selection bias.

MeSH terms

  • False Positive Reactions
  • Genetic Diseases, Inborn / genetics*
  • Humans
  • Models, Genetic*
  • Observer Variation