GPDTI: a Genetic Programming Decision Tree induction method to find epistatic effects in common complex diseases

Bioinformatics. 2007 Jul 1;23(13):i167-74. doi: 10.1093/bioinformatics/btm205.

Abstract

Motivation: The identification of risk-associated genetic variants in common diseases remains a challenge for the biomedical research community. It has been suggested that common statistical approaches that exclusively measure main effects are often unable to detect interactions between some of these variants. Detecting and interpreting interactions is a challenging open problem from both the statistical and the computational perspectives. Methods from computing science may improve our understanding of the mechanisms of genetic disease by detecting interactions even in the presence of very low heritabilities.
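As an illustration of why main-effect tests can miss an interaction, the following minimal sketch builds a two-locus penetrance table in which disease risk depends only on the genotype combination. The table values, the allele frequency of 0.5, and the function names are assumptions chosen for illustration and are not taken from the paper; the heritability formula (variance of penetrance divided by K(1-K), with K the prevalence) is a definition commonly used in epistasis simulation studies and is likewise assumed here.

```python
# Hypothetical two-locus penetrance table (rows: AA, Aa, aa; columns: BB, Bb, bb).
# Values are illustrative only, not one of the models used in the paper.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

# Hardy-Weinberg genotype frequencies for allele frequency 0.5 (assumed at both loci).
FREQ = [0.25, 0.5, 0.25]


def marginal_penetrance(table, freq):
    """Disease risk for each genotype at locus A, averaged over locus B."""
    return [sum(p * f for p, f in zip(row, freq)) for row in table]


def heritability(table, freq):
    """Variance of penetrance across genotypes divided by K(1-K), K = prevalence."""
    cells = [(table[i][j], freq[i] * freq[j]) for i in range(3) for j in range(3)]
    k = sum(p * w for p, w in cells)
    var = sum(w * (p - k) ** 2 for p, w in cells)
    return var / (k * (1 - k))


if __name__ == "__main__":
    print(marginal_penetrance(PENETRANCE, FREQ))
    print(heritability(PENETRANCE, FREQ))
```

Every genotype at locus A carries the same marginal risk in this table, so a test that looks at one locus at a time sees no signal, although the heritability of the joint model is non-zero.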

Results: We have implemented a method that uses Genetic Programming to induce a Decision Tree for detecting interactions among genetic variants. The method uses a cross-validation strategy to estimate classification and prediction errors and tests the consistency of the results. To obtain better estimates, we propose a new consistency measure that takes interactions into account and can be used in a genetic programming environment. The method detected five different interaction models with heritabilities as low as 0.008 and with prediction errors similar to the generated errors.
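The cross-validation step can be sketched as follows, under stated assumptions. The classifier here is a simple majority-vote lookup table over genotype combinations, used as a stand-in; it is not the Genetic Programming decision tree induction described above, and all function names are hypothetical. Classification error is taken to be the error on the training folds and prediction error the error on the held-out fold, which is an interpretation of the abstract rather than a detail it states.

```python
import random
from collections import Counter, defaultdict


def fit_lookup(rows, loci):
    """Stand-in classifier: majority status per genotype combination at `loci`."""
    votes = defaultdict(Counter)
    for genotypes, status in rows:
        votes[tuple(genotypes[i] for i in loci)][status] += 1
    # Fall back to the overall majority status for unseen combinations.
    default = Counter(s for _, s in rows).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    return lambda g: table.get(tuple(g[i] for i in loci), default)


def error_rate(model, rows):
    """Fraction of rows whose predicted status differs from the true status."""
    return sum(model(g) != s for g, s in rows) / len(rows)


def cross_validate(rows, loci, k=10, seed=0):
    """Return (mean training error, mean held-out error) over k folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    train_errors, test_errors = [], []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = fit_lookup(train, loci)
        train_errors.append(error_rate(model, train))
        test_errors.append(error_rate(model, test))
    return sum(train_errors) / k, sum(test_errors) / k


if __name__ == "__main__":
    # Noise-free toy data: status is the XOR of two binary genotypes.
    data = [((a, b), a ^ b) for a in (0, 1) for b in (0, 1)] * 25
    print(cross_validate(data, loci=(0, 1), k=5))  # both loci modelled jointly
    print(cross_validate(data, loci=(0,), k=5))    # one locus alone
```

On this toy data the joint model separates cases from controls, while a single locus carries no information about status, which is the situation the method is designed to address.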

Availability: Information on the generated data sets and executable code is available upon request.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Chromosome Mapping / methods*
  • Decision Support Techniques*
  • Epistasis, Genetic*
  • Genetic Predisposition to Disease / genetics*
  • Genetic Testing / methods
  • Genetics, Population*
  • Humans
  • Models, Genetic*
  • Penetrance