An information-gain approach to detecting three-way epistatic interactions in genetic association studies

J Am Med Inform Assoc. Jul-Aug 2013;20(4):630-6. doi: 10.1136/amiajnl-2012-001525. Epub 2013 Feb 8.


Background: Epistasis has historically been used to describe the phenomenon in which the effect of a given gene on a phenotype depends on one or more other genes, and it is essential for understanding the association between genetic and phenotypic variation. Quantifying epistasis of order higher than two is very challenging because of both the computational complexity of enumerating all possible combinations in genome-wide data and the lack of efficient and effective methodologies.

Objectives: In this study, we propose a fast, non-parametric, and model-free measure for three-way epistasis.

Methods: The measure is based on information gain and separates all lower-order effects from pure three-way epistasis.
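
The idea of stripping lower-order effects out of a three-way information measure can be illustrated with a minimal Python sketch. The abstract does not give the formula, so the decomposition below is an assumption: it takes the mutual information between the three loci jointly and the phenotype, then subtracts the three pairwise gains and the three main effects. The function names (`entropy`, `mutual_information`, `pairwise_gain`, `three_way_gain`), the integer genotype coding, and the plug-in entropy estimate from raw counts are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(columns):
    """Plug-in Shannon entropy (bits) of the joint distribution of the columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((k / n) * log2(k / n) for k in Counter(rows).values())


def mutual_information(xs, d):
    """I(X1,...,Xk; D) = H(X1,...,Xk) + H(D) - H(X1,...,Xk, D)."""
    return entropy(xs) + entropy([d]) - entropy(xs + [d])


def pairwise_gain(a, b, d):
    """Assumed two-way gain: joint information minus both main effects."""
    return (mutual_information([a, b], d)
            - mutual_information([a], d)
            - mutual_information([b], d))


def three_way_gain(a, b, c, d):
    """Assumed three-way gain: joint information minus all lower-order terms."""
    total = mutual_information([a, b, c], d)
    pairs = pairwise_gain(a, b, d) + pairwise_gain(a, c, d) + pairwise_gain(b, c, d)
    mains = sum(mutual_information([x], d) for x in (a, b, c))
    return total - pairs - mains


# Toy data (not from the study): phenotype is the XOR of three binary loci,
# so no single locus or pair of loci is informative on its own.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
d = [x ^ y ^ z for x, y, z in zip(a, b, c)]

print(three_way_gain(a, b, c, d))
```

On this toy XOR pattern every main effect and pairwise gain is zero, so the three-way gain should come out at about 1 bit, the full entropy of the phenotype. When the phenotype simply copies one locus, the same function returns approximately zero, because all of the information is accounted for by a main effect.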

Results: Our method was verified on synthetic data and applied to real data from a candidate-gene study of tuberculosis in a West African population. In the tuberculosis data, we found a statistically significant pure three-way epistatic interaction effect that was stronger than any lower-order associations.

Conclusion: Our study provides a methodological basis for detecting and characterizing high-order gene-gene interactions in genetic association studies.

Keywords: epistasis; gene-gene interaction; genetic association studies; high-order interaction; information gain.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Epistasis, Genetic*
  • Genetic Association Studies*
  • Genetic Predisposition to Disease*
  • Humans
  • Information Theory
  • Statistics, Nonparametric