One common strategy for detecting disease-associated genetic markers is to compare the genotype distributions between cases and controls, where cases are individuals who have been diagnosed with the disease. In a study of a complex disease with a heterogeneous etiology, the sampled case group most likely consists of people with different disease subtypes. If we conduct an association test by treating all cases as a single group, we maximize our chance of finding genetic risk factors with a homogeneous effect, regardless of the underlying disease etiology. However, this strategy might diminish the power for detecting risk factors whose effect size varies by disease subtype. We propose a robust statistical procedure to identify genetic risk factors that have either a uniform effect for all disease subtypes or heterogeneous effects across different subtypes, in situations where the subtypes are not predefined but can be characterized roughly by a set of clinical and/or pathologic markers. We demonstrate the advantage of the new procedure through numerical simulation studies and an application to a breast cancer study.
Keywords: Breast cancer; Etiology heterogeneity; Genetic association study; Multiple-comparison adjustment; Tree-based model.
© Published by Oxford University Press 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.