A Bayesian toolkit for genetic association studies

Genet Epidemiol. 2006 Apr;30(3):231-47. doi: 10.1002/gepi.20140.


We present a range of modelling components designed to facilitate Bayesian analysis of data from genetic association studies. A key feature of our approach is that submodels can be combined almost arbitrarily to deal with the complexities of real data. In particular, we propose various techniques for selecting the "best" subset of genetic predictors for a specific phenotype (or set of phenotypes). At the same time, we can control for complex, non-linear relationships between phenotypes and additional (non-genetic) covariates, and account for any residual correlation among multiple phenotypes. Both of these additional modelling components are shown to have the potential to aid detection of the underlying genetic signal. We can also account for uncertainty about missing genotype data. Indeed, at the heart of our approach is a novel method for reconstructing unobserved haplotypes and/or inferring the values of missing genotypes. This method can be deployed independently or fully integrated into arbitrary genotype- or haplotype-based association models, so that the missing data and the association model are "estimated" simultaneously. The impact of such simultaneous analysis on inferences drawn from the association model is shown to be potentially significant. Our modelling components are packaged as an "add-on" interface to the widely used WinBUGS software, which allows Markov chain Monte Carlo analysis of a wide range of statistical models. We illustrate their use with a series of increasingly complex analyses conducted on simulated data based on a real pharmacogenetic example.
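
To make the idea of estimating missing genotypes and the association model simultaneously more concrete, the sketch below computes the full conditional distribution of a single missing genotype, as might be used in one step of a Gibbs sampler. It is not the method of the paper or the WinBUGS add-on: it assumes a biallelic marker in Hardy-Weinberg equilibrium and a normal phenotype with an additive genetic effect, and the function name and parameters (`genotype_full_conditional`, `allele_freq`, `alpha`, `beta`, `sigma`) are hypothetical.

```python
import math


def genotype_full_conditional(y, allele_freq, alpha, beta, sigma):
    """Return [P(g=0), P(g=1), P(g=2)] for one missing genotype.

    Illustrative assumptions (not taken from the paper):
    - g counts copies of one allele with population frequency allele_freq,
      with a Hardy-Weinberg prior on g;
    - the phenotype y is normal with mean alpha + beta * g and sd sigma.
    """
    p = allele_freq
    # Hardy-Weinberg prior over genotypes 0, 1, 2.
    prior = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]

    weights = []
    for g, prior_g in enumerate(prior):
        residual = y - (alpha + beta * g)
        # Normal likelihood up to a constant that cancels on normalisation.
        likelihood = math.exp(-0.5 * (residual / sigma) ** 2)
        weights.append(prior_g * likelihood)

    total = sum(weights)
    return [w / total for w in weights]
```

Within a sampler, a genotype would be drawn from these probabilities at each iteration given the current values of `alpha`, `beta` and `sigma`, and those parameters would in turn be updated given the imputed genotypes, so the phenotype informs the imputation and the imputation uncertainty propagates into the association estimates. When `beta` is zero the phenotype carries no information and the probabilities reduce to the Hardy-Weinberg prior.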

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem*
  • Genetic Techniques*
  • Genotype
  • Haplotypes
  • Markov Chains
  • Models, Genetic
  • Monte Carlo Method
  • Phenotype