Comparison of three summary statistics for ranking genes in genome-wide association studies

Stat Med. 2014 May 20;33(11):1828-41. doi: 10.1002/sim.6063. Epub 2013 Dec 9.

Abstract

Problems associated with insufficient power have haunted the analysis of genome-wide association studies and are likely to be the main challenge for the analysis of next-generation sequencing data. Ranking genes according to their strength of association with the investigated phenotype is one solution. To obtain rankings for genes, researchers can draw from a wide range of statistics summarizing the relationships between variants mapped to a gene and the phenotype. Hence, it is of interest to explore the performance of these statistics in the context of rankings. To this end, we conducted a simulation study (limited to genes of equal sizes) of three summary statistics, examining their ability to rank genes in a meaningful order. The weighted sum of squared marginal score test (Pan, 2009), the RareCover algorithm (Bhatia et al., 2010) and elastic net regularization (Zou and Hastie, 2005) were chosen because they can handle common as well as rare variants. The test based on the score statistic outperformed the other two methods in almost all investigated scenarios. It was the only measure to consistently detect genes with interacting causal variants. However, the RareCover algorithm proved better than the weighted sum of squared marginal score test at identifying genes that include causal variants with small effect sizes and low minor allele frequency. The performance of elastic net regularization was unimpressive for all but the simplest scenarios.

Keywords: collapsing methods; gene ranking; genome-wide association studies; multiple marker tests; penalization; score test.
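
To illustrate what ranking genes by a summary statistic can look like, the following is a minimal sketch of a weighted sum of squared marginal score statistics for a binary phenotype. It assumes the statistic takes the form U' diag(V)^{-1} U, where U is the vector of marginal scores under the null of no association and V is its null covariance. The abstract does not give the exact computation used in the study, so the function names (ssuw_statistic, rank_genes) and the toy data are hypothetical.

```python
def ssuw_statistic(genotypes, phenotype):
    """Weighted sum of squared marginal scores for one gene (illustrative).

    genotypes: one list per subject of minor-allele counts (0, 1 or 2)
    phenotype: list of 0/1 case-control labels, one per subject
    """
    n = len(phenotype)
    y_bar = sum(phenotype) / n
    total = 0.0
    for j in range(len(genotypes[0])):
        x = [row[j] for row in genotypes]
        x_bar = sum(x) / n
        # Marginal score of variant j under the null of no association.
        u = sum((yi - y_bar) * xi for yi, xi in zip(phenotype, x))
        # Null variance of that score (assumed diagonal element of V).
        v = y_bar * (1 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
        if v > 0:  # monomorphic variants carry no information; skip them
            total += u * u / v
    return total


def rank_genes(gene_genotypes, phenotype):
    """Return gene names ordered from strongest to weakest statistic."""
    scores = {g: ssuw_statistic(x, phenotype) for g, x in gene_genotypes.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): six subjects, two genes with two variants each.
phenotype = [1, 1, 1, 0, 0, 0]
gene_genotypes = {
    "geneA": [[1, 0], [1, 1], [1, 0], [0, 0], [0, 1], [0, 0]],
    "geneB": [[1, 0], [0, 1], [0, 0], [1, 0], [0, 1], [0, 0]],
}
ranking = rank_genes(gene_genotypes, phenotype)
```

In this toy example the first variant of geneA is carried only by cases, so geneA ranks ahead of geneB. Comparing raw statistics across genes in this way is only reasonable when genes contain the same number of variants, which matches the equal-gene-size restriction of the simulation study.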

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Gene Frequency
  • Genetic Variation / genetics*
  • Genome-Wide Association Study / methods*
  • Humans
  • Phenotype*