State-of-the-art next-generation sequencing technologies facilitate in-depth exploration of the human genome by making it possible to investigate both common and rare variants. To identify genetic factors associated with disease risk or other complex phenotypes, methods have been proposed for jointly analyzing the variants in a set (e.g., all coding SNPs in a gene). Variants in a properly defined set could be associated with risk or phenotype in a concerted fashion, and accumulating information across them can improve the power to detect genetic risk factors. Many set-based methods in the literature are based on statistics that can be written as the sum of variant statistics. Here, we propose taking the sum of the exponentials of the variant statistics as the set summary for association testing. From both Bayesian and frequentist perspectives, we provide theoretical justification for this choice: it is particularly powerful for sparse alternatives, that is, when only relatively few of the many variants tested in a set are associated with disease risk, a distinctive feature of genetic data. We applied the exponential combination gene-based test to a sequencing study in anticancer pharmacogenomics and uncovered mechanistic insights into genes and pathways related to chemotherapeutic susceptibility for an important class of oncologic drugs.
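As an illustration of the contrast drawn above between summing variant statistics and summing their exponentials, the following is a minimal sketch, not the authors' implementation. It rests on several assumptions that are not stated in the abstract: the per-variant statistic is taken to be n times the squared genotype-phenotype correlation, no scaling is applied inside the exponential, and significance is assessed by permuting phenotypes. All function names are hypothetical.

```python
import math
import random


def variant_statistic(genotypes, phenotype):
    """Per-variant statistic: n * squared Pearson correlation.

    Assumption for illustration only; the abstract does not specify
    which variant statistic is combined.
    """
    n = len(phenotype)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotype) / n
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotype))
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_pp = sum((p - mean_p) ** 2 for p in phenotype)
    if s_gg == 0 or s_pp == 0:
        return 0.0  # monomorphic variant or constant phenotype carries no signal
    return n * s_gp * s_gp / (s_gg * s_pp)


def sum_summary(stats):
    """Conventional set summary: plain sum of the variant statistics."""
    return sum(stats)


def log_exp_sum_summary(stats):
    """Log of the sum of exponentials of the variant statistics.

    The log is monotone, so it ranks data sets exactly as the exp-sum
    does, while avoiding overflow for large statistics.
    Expects a non-empty list.
    """
    top = max(stats)
    return top + math.log(sum(math.exp(s - top) for s in stats))


def permutation_p_value(variants, phenotype, summary, n_perm=200, seed=0):
    """Permutation p-value for a set summary (assumed testing scheme).

    `variants` is a list of genotype lists, one per variant in the set.
    """
    rng = random.Random(seed)
    observed = summary([variant_statistic(g, phenotype) for g in variants])
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        null = summary([variant_statistic(g, shuffled) for g in variants])
        if null >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With one large statistic among many small ones, `log_exp_sum_summary` stays close to the largest value, whereas `sum_summary` weights every variant equally. This is one informal way to see why an exponential combination might favor sparse alternatives.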
Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.