A fast and powerful tree-based association test for detecting complex joint effects in case-control studies

Bioinformatics. 2014 Aug 1;30(15):2171-8. doi: 10.1093/bioinformatics/btu186. Epub 2014 Apr 9.

Abstract

Motivation: Multivariate tests derived from the logistic regression model are widely used to assess the joint effect of multiple predictors on a disease outcome in case-control studies. These tests become suboptimal when the additive model cannot adequately approximate the joint effect. The tree-structure model is an attractive alternative, as it is better suited to capturing non-additive effects. However, the tree model is used mostly for prediction and seldom for hypothesis testing, mainly because of the computational burden of the resampling-based procedure required to estimate the significance level.
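
To make the computational burden concrete, the following is a minimal sketch of a resampling-based significance estimate for a tree-type statistic. It is not the TREAT algorithm: the statistic (the largest 2x2 chi-square over single-predictor splits, i.e. a depth-1 tree), the function names, and the binary coding of predictors are all assumptions made for illustration. The point is that the statistic must be recomputed on every permuted data set, so the cost grows with the number of permutations times the cost of building the tree.

```python
import random


def split_chi2(x, y):
    """Pearson chi-square for the 2x2 table of a binary split x vs. case status y."""
    n = len(y)
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)      # split=1, case
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)  # split=1, control
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)  # split=0, case
    d = n - a - b - c                                    # split=0, control
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def best_split_stat(X, y):
    """Illustrative depth-1 tree statistic: best chi-square over all predictors.

    X is a list of predictor columns, each a list of 0/1 values.
    """
    return max(split_chi2(col, y) for col in X)


def permutation_pvalue(X, y, stat=best_split_stat, n_perm=999, seed=0):
    """Estimate the significance level by permuting case-control labels."""
    rng = random.Random(seed)
    observed = stat(X, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # The whole model search is repeated on each permuted data set.
        if stat(X, y_perm) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

A call such as `permutation_pvalue(X, y, n_perm=9999)` refits the statistic 9999 times; with a deeper tree and tens of thousands of genes, this repetition is the cost that a fast tree-building algorithm is meant to reduce.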

Results: We designed a fast algorithm for building the tree-structure model and proposed a robust TREe-based Association Test (TREAT) that incorporates an adaptive model selection procedure to identify the optimal tree model representing the joint effect. We applied TREAT as a multilocus association test on >20 000 genes/regions in a study of esophageal squamous cell carcinoma (ESCC) and detected a highly significant novel association between the gene CDKN2B and ESCC ([Formula: see text]). We also demonstrated, through simulation studies, the power advantage of TREAT over other commonly used tests.
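
One common way to combine an adaptive model selection step with a valid test is a min-p construction over a set of candidate models, sketched below. The abstract does not say how TREAT performs its selection, so this is an assumed stand-in rather than the authors' procedure; `stat_table` and the function names are hypothetical. Each row holds the statistic of one candidate tree model (for example, trees of different sizes), with column 0 computed on the observed data and the remaining columns on permuted data.

```python
def rank_pvalues(stats):
    """P-value of every replicate, ranked against all replicates of the same model."""
    n = len(stats)
    return [sum(1 for s in stats if s >= t) / n for t in stats]


def adaptive_minp(stat_table):
    """Select the best candidate model and return a selection-adjusted p-value.

    stat_table[k][b] is the statistic of candidate model k on replicate b,
    where b = 0 is the observed data and b >= 1 are permutation replicates.
    """
    p_table = [rank_pvalues(row) for row in stat_table]
    n_rep = len(stat_table[0])
    # Smallest p-value across candidate models, for each replicate.
    min_p = [min(p_row[b] for p_row in p_table) for b in range(n_rep)]
    # The model with the smallest observed p-value is the selected one.
    best_model = min(range(len(p_table)), key=lambda k: p_table[k][0])
    # Calibrate the observed minimum against the permutation minima,
    # so the selection step is accounted for in the final p-value.
    adjusted = sum(1 for m in min_p if m <= min_p[0]) / n_rep
    return best_model, adjusted
```

Because the same permutation replicates are reused for both the per-model p-values and the calibration of their minimum, this design needs only one layer of resampling rather than a nested one.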

Availability and implementation: The package TREAT is freely available for download at http://www.hanzhang.name/softwares/treat, implemented in C++ and R and supported on 64-bit Linux and 64-bit MS Windows.

Contact: yuka@mail.nih.gov

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Carcinoma, Squamous Cell / genetics
  • Case-Control Studies
  • Computational Biology / methods*
  • Cyclin-Dependent Kinase Inhibitor p15 / genetics
  • Decision Trees*
  • Esophageal Neoplasms / genetics
  • Genetic Predisposition to Disease / genetics
  • Genotype
  • Humans
  • Logistic Models
  • Polymorphism, Single Nucleotide
  • Time Factors

Substances

  • Cyclin-Dependent Kinase Inhibitor p15