Determining population stratification and subgroup effects in association studies of rare genetic variants for nicotine dependence

Psychiatr Genet. 2019 Aug;29(4):111-119. doi: 10.1097/YPG.0000000000000227.


Background: Rare variants (minor allele frequency < 1% or < 5%) can help researchers address the problem of 'missing heritability' and have a proven role in dissecting the etiology of human diseases and complex traits.

Methods: We extended the combined multivariate and collapsing (CMC) and weighted sum statistic (WSS) methods to account for population stratification and subgroup effects by running analyses stratified according to principal component analysis; we call the extended methods 'str-CMC' and 'str-WSS'. To evaluate the validity of the extended methods, we analyzed the Genetic Architecture of Smoking and Smoking Cessation database, which includes African Americans and European Americans genotyped on Illumina Human Omni2.5, and we compared the results with those obtained with the sequence kernel association test (SKAT) and its modification, SKAT-O, with population stratification and subgroup effects included as covariates. We used the Cochran-Mantel-Haenszel test to check for possible differences in single nucleotide polymorphism allele frequency between subgroups within a gene. We aimed to detect rare variants, while accounting for population stratification and subgroup effects, in the genomic region containing 39 acetylcholine receptor-related genes.
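
The idea of a stratified collapsing analysis can be illustrated with a minimal sketch. It is not the authors' str-CMC implementation: the original CMC method combines collapsed groups with a multivariate (Hotelling's T²) test, whereas this sketch swaps in Fisher's exact test on a single collapsed carrier indicator per stratum. The stratum labels are assumed to be given (in the study they are derived from principal component analysis), and all function names, the input layout, and the default threshold are hypothetical.

```python
from math import comb


def collapse(genotypes, mafs, maf_threshold=0.05):
    """Collapse rare variants into one carrier indicator per individual.

    genotypes: list of rows, each row holding minor-allele counts (0/1/2) per variant.
    mafs: minor allele frequency of each variant.
    Returns 1 if the individual carries a minor allele at any rare variant, else 0.
    """
    rare = [j for j, f in enumerate(mafs) if f < maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def stratified_collapsing_test(genotypes, phenotypes, strata, maf_threshold=0.05):
    """Run the collapsing test separately within each stratum.

    phenotypes: 1 for case, 0 for control. strata: one label per individual.
    MAFs are estimated within each stratum, so a variant may be rare in one
    subgroup and common in another. Returns {stratum label: p-value}.
    """
    results = {}
    for label in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == label]
        geno = [genotypes[i] for i in idx]
        pheno = [phenotypes[i] for i in idx]
        n_var = len(geno[0])
        mafs = [sum(row[j] for row in geno) / (2 * len(geno)) for j in range(n_var)]
        carrier = collapse(geno, mafs, maf_threshold)
        pairs = list(zip(carrier, pheno))
        a = pairs.count((1, 1))  # carrier cases
        b = pairs.count((1, 0))  # carrier controls
        c = pairs.count((0, 1))  # non-carrier cases
        d = pairs.count((0, 0))  # non-carrier controls
        results[label] = fisher_exact_two_sided(a, b, c, d)
    return results
```

Running the test per stratum, rather than pooling all individuals, lets a signal that exists in only one subgroup show up without being diluted by the others.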

Results: The Cochran-Mantel-Haenszel test applied to GABRG2 was significant (P = 0.001). GABRG2 was detected by both str-CMC (P = 8.04E-06) and str-WSS (P = 0.046) in African Americans, but not by SKAT or SKAT-O.
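
For readers unfamiliar with the Cochran-Mantel-Haenszel test, the statistic can be sketched as follows. The table layout is an assumption made for illustration (one 2x2 table per single nucleotide polymorphism, with rows for the two subgroups and columns for minor and major allele counts); the abstract does not specify how the authors arranged their tables. The function name and the `continuity` parameter are hypothetical.

```python
from math import erfc, sqrt


def cmh_test(tables, continuity=True):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each table is ((a, b), (c, d)). Returns (chi2, p_value), where the
    p-value comes from a chi-square distribution with 1 degree of freedom.
    """
    num = 0.0  # sum over tables of observed minus expected top-left count
    var = 0.0  # sum over tables of the hypergeometric variance
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a table this small carries no information
        r1, r2, c1, c2 = a + b, c + d, a + c, b + d
        num += a - r1 * c1 / n
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    if var == 0:
        raise ValueError("no informative tables")
    diff = abs(num)
    if continuity:
        diff = max(0.0, diff - 0.5)  # continuity correction
    chi2 = diff ** 2 / var
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))
```

A small P value under this layout would indicate that allele frequencies differ between the subgroups consistently across the SNPs in the gene.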

Conclusions: Our results imply that when associated rare variants are specific to a single subgroup, a stratified analysis might be a better approach than a combined analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genetic Predisposition to Disease*
  • Genetics, Population*
  • Genome-Wide Association Study*
  • Humans
  • Multivariate Analysis
  • Mutation / genetics*
  • Principal Component Analysis
  • Tobacco Use Disorder / genetics*