Covariate selection for association screening in multiphenotype genetic studies

Nat Genet. 2017 Dec;49(12):1789-1795. doi: 10.1038/ng.3975. Epub 2017 Oct 16.


Testing for associations in big data faces the problem of multiple comparisons, wherein true signals are difficult to detect against the background of all associations queried. This difficulty is particularly salient in human genetic association studies, in which phenotypic variation is often driven by numerous variants of small effect. The current strategy to improve power to identify these weak associations consists of applying standard marginal statistical approaches and increasing study sample sizes. Although successful, this approach does not leverage the environmental and genetic factors shared among the multiple phenotypes collected in contemporary cohorts. Here we developed covariates for multiphenotype studies (CMS), an approach that improves power when correlated phenotypes are measured on the same samples. Our analyses of real and simulated data provide direct evidence that correlated phenotypes can be used to increase power, often to levels surpassing the power gained by a twofold increase in sample size.
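
The intuition behind using correlated phenotypes can be illustrated with a small simulation. The sketch below is not the CMS algorithm; it only shows the general principle that adjusting for a second phenotype sharing non-genetic variance with the outcome shrinks residual variance and so sharpens the test of a single variant. All names (`z_stat`, `simulate`) and parameter values are hypothetical. The covariate is simulated with no dependence on the variant, an assumption that sidesteps the covariate-selection problem named in the title.

```python
import numpy as np


def z_stat(y, X):
    """Return the OLS z-statistic for the first column of X (intercept added)."""
    n = len(y)
    A = np.column_stack([X, np.ones(n)])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])     # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance
    return beta[0] / np.sqrt(cov[0, 0])


def simulate(n=20000, effect=0.1, seed=0):
    """Compare a marginal test with one adjusted for a correlated phenotype."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    shared = rng.normal(size=n)                     # factor shared by both phenotypes
    y = effect * g + shared + 0.5 * rng.normal(size=n)  # outcome phenotype
    c = shared + 0.5 * rng.normal(size=n)           # covariate, no genetic effect (assumption)
    z_marginal = z_stat(y, g[:, None])
    z_adjusted = z_stat(y, np.column_stack([g, c]))
    return z_marginal, z_adjusted


if __name__ == "__main__":
    z_marginal, z_adjusted = simulate()
    print(f"marginal z = {z_marginal:.2f}, adjusted z = {z_adjusted:.2f}")
```

Under these made-up parameters the non-genetic variance of the outcome is 1.25, and conditioning on the covariate reduces it to 0.45, so the adjusted statistic is expected to be roughly 1.67 times the marginal one. Doubling the sample size would give a factor of about 1.41.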

MeSH terms

  • Algorithms
  • Genetic Association Studies / methods*
  • Genetic Variation*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Humans
  • Models, Genetic
  • Multivariate Analysis*
  • Phenotype
  • Reproducibility of Results
  • Sample Size