A comparison of multivariate genome-wide association methods

PLoS One. 2014 Apr 24;9(4):e95923. doi: 10.1371/journal.pone.0095923. eCollection 2014.

Abstract

Joint association analysis of multiple traits in a genome-wide association study (GWAS), i.e. a multivariate GWAS, offers several advantages over analyzing each trait in a separate GWAS. In this study we directly compared a number of multivariate GWAS methods using simulated data. We focused on six methods that are implemented in the software packages PLINK, SNPTEST, MultiPhen, BIMBAM, PCHAT and TATES, and also compared them to standard univariate GWAS, analysis of the first principal component of the traits, and meta-analysis of univariate results. We simulated data (N = 1000) for three quantitative traits and one bi-allelic quantitative trait locus (QTL), and varied the number of traits associated with the QTL (explained variance 0.1%), minor allele frequency of the QTL, residual correlation between the traits, and the sign of the correlation induced by the QTL relative to the residual correlation. We compared the power of the methods using empirically fixed significance thresholds (α = 0.05). Our results showed that the multivariate methods implemented in PLINK, SNPTEST, MultiPhen and BIMBAM performed best for the majority of the tested scenarios, with a notable increase in power for scenarios with an opposite sign of genetic and residual correlation. All multivariate analyses resulted in a higher power than univariate analyses, even when only one of the traits was associated with the QTL. Hence, use of multivariate GWAS methods can be recommended, even when genetic correlations between traits are weak.
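The simulation design summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code, and it does not reproduce any of the named packages. It assumes Hardy-Weinberg genotype frequencies, an additive QTL effect, unit-variance traits, and an exchangeable (equal, non-negative) residual correlation. For the joint test it swaps in a generic Wilks' lambda F statistic from a multivariate regression of the three traits on genotype. The threshold is set empirically as the upper quantile of that statistic in data simulated without a QTL effect. All function names and default values (`simulate`, `wilks_f`, `empirical_threshold`, `power`, `maf=0.3`, the replicate counts) are hypothetical choices made for illustration.

```python
import math
import random


def simulate(n=1000, maf=0.3, n_assoc=1, var_explained=0.001,
             resid_corr=0.5, opposite_sign=False, rng=None):
    """Simulate one bi-allelic QTL and three unit-variance quantitative traits.

    Assumptions of this sketch: Hardy-Weinberg genotypes, additive effects,
    and 0 <= resid_corr <= 1 shared equally by all trait pairs.
    """
    rng = rng or random.Random()
    var_g = 2 * maf * (1 - maf)                # genotype variance under HWE
    beta = math.sqrt(var_explained / var_g)    # effect giving the requested explained variance
    effects = [beta if j < n_assoc else 0.0 for j in range(3)]
    if opposite_sign and n_assoc > 1:
        # Flip one effect so the QTL-induced correlation between traits 1 and 2
        # is negative while the residual correlation stays positive.
        effects[1] = -effects[1]
    resid_sd = [math.sqrt(1 - b * b * var_g) for b in effects]
    genotypes, traits = [], []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)   # minor-allele count 0, 1 or 2
        shared = rng.gauss(0, 1)                          # factor shared by all residuals
        row = []
        for j in range(3):
            e = math.sqrt(resid_corr) * shared + math.sqrt(1 - resid_corr) * rng.gauss(0, 1)
            row.append(effects[j] * (g - 2 * maf) + resid_sd[j] * e)
        genotypes.append(g)
        traits.append(row)
    return genotypes, traits


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def wilks_f(genotypes, traits):
    """F statistic from Wilks' lambda for regressing three traits on one SNP."""
    n, p = len(genotypes), 3
    g_mean = sum(genotypes) / n
    y_mean = [sum(r[j] for r in traits) / n for j in range(p)]
    gc = [g - g_mean for g in genotypes]
    yc = [[r[j] - y_mean[j] for j in range(p)] for r in traits]
    sgg = sum(x * x for x in gc)
    if sgg == 0:
        return 0.0                              # monomorphic sample: nothing to test
    sgy = [sum(x * r[j] for x, r in zip(gc, yc)) for j in range(p)]
    total = [[sum(r[j] * r[k] for r in yc) for k in range(p)] for j in range(p)]
    # Residual cross-products after removing the part explained by genotype.
    resid = [[total[j][k] - sgy[j] * sgy[k] / sgg for k in range(p)] for j in range(p)]
    lam = det3(resid) / det3(total)             # Wilks' lambda
    return (1 - lam) / lam * (n - p - 1) / p


def empirical_threshold(n_null=200, alpha=0.05, seed=1, **kw):
    """Approximate (1 - alpha) quantile of the statistic when no trait is associated."""
    rng = random.Random(seed)
    null_kw = {**kw, "n_assoc": 0}
    null = sorted(wilks_f(*simulate(rng=rng, **null_kw)) for _ in range(n_null))
    return null[min(n_null - 1, int((1 - alpha) * n_null))]


def power(threshold, n_rep=200, seed=2, **kw):
    """Fraction of simulated data sets whose statistic exceeds the threshold."""
    rng = random.Random(seed)
    hits = sum(wilks_f(*simulate(rng=rng, **kw)) > threshold for _ in range(n_rep))
    return hits / n_rep
```

A scenario would be evaluated by first calling `empirical_threshold(maf=0.3, resid_corr=0.5)` and then passing the result to `power(...)` with the same `maf` and `resid_corr` plus the desired `n_assoc` and `opposite_sign`. With only a few hundred replicates the power estimate is noisy, so the replicate counts would need to be raised considerably before scenarios are compared.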

Publication types

  • Comparative Study
  • Meta-Analysis
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Genome-Wide Association Study*
  • Humans
  • Multivariate Analysis
  • Quantitative Trait Loci

Grant support

This work was performed within a PhD project supported by the Nijmegen Centre for Evidence Based Practice, Radboud university medical center, Nijmegen, The Netherlands. This work was sponsored by the Stichting Nationale Computerfaciliteiten (National Computing Facilities Foundation, NCF) for the use of supercomputer facilities, with financial support from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organization for Scientific Research, NWO). The research in Liège, Belgium is funded by the Belgian Science Policy Office Phase VII IAP network “Dynamical systems, control and optimization” (DYSCO II) and the Fonds de la Recherche Scientifique (FNRS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.