Revisiting a GWAS peak in Arabidopsis thaliana reveals possible confounding by genetic heterogeneity

Heredity (Edinb). 2021 Sep;127(3):245-252. doi: 10.1038/s41437-021-00456-3. Epub 2021 Jul 5.


Genome-wide association studies (GWAS) have become a standard approach for exploring the genetic basis of phenotypic variation. However, correlation is not causation, and only a tiny fraction of all associations have been experimentally confirmed. One practical problem is that a peak of association does not always pinpoint a causal gene, but may instead be tagging multiple causal variants. In this study, we reanalyze a previously reported peak associated with flowering-time traits in a Swedish Arabidopsis thaliana population. The peak appeared to pinpoint the AOP2/AOP3 cluster of glucosinolate biosynthesis genes, which is known to be responsible for natural variation in herbivore resistance. Here we propose an alternative hypothesis, by demonstrating that the AOP2/AOP3 flowering association can be wholly accounted for by allelic variation in two flanking genes with clear roles in regulating flowering: NDX1, a regulator of the main flowering time controller FLC, and GA1, which plays a central role in gibberellin synthesis and is required for flowering under some conditions. In other words, we propose that the AOP2/AOP3 flowering-time association may be yet another example of a spurious, "synthetic" association, arising from trying to fit a single-locus model in the presence of two statistically associated causative loci. We conclude that caution is needed when using GWAS for fine-mapping.
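
The notion of a "synthetic" association can be illustrated with a small simulation. The sketch below is a toy example, not the study's data or analysis pipeline: it assumes two hypothetical causal loci (`locus_a`, `locus_b`) with made-up allele frequencies and effect sizes, and a non-causal marker (`marker`) that tends to carry the alternate allele whenever either causal locus does. For simplicity the two causal loci are simulated independently, and ordinary least squares stands in for the mixed models typically used in GWAS. A single-locus test at the marker gives a strong signal, which should vanish once both causal loci are included as covariates.

```python
import numpy as np

def t_stats(X, y):
    """OLS t-statistics for each column of X (an intercept is added internally)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return (beta / np.sqrt(np.diag(cov)))[1:]  # drop the intercept

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of inbred lines (genotypes coded 0/1)

# Two causal loci; frequencies and effects are illustrative assumptions.
a = rng.random(n) < 0.3
b = rng.random(n) < 0.3

# Non-causal marker that tags both loci: it carries the alternate allele
# when either causal locus does, with 10% of lines flipped as noise.
flip = rng.random(n) < 0.1
m = (a | b) ^ flip

locus_a = a.astype(float)
locus_b = b.astype(float)
marker = m.astype(float)

# The phenotype depends only on the two causal loci plus noise.
y = 1.0 * locus_a + 1.0 * locus_b + rng.normal(0.0, 1.0, n)

# Single-locus model: the marker alone looks strongly associated.
t_single = t_stats(marker, y)

# Joint model: conditioning on both causal loci should remove the marker's signal.
t_joint = t_stats(np.column_stack([marker, locus_a, locus_b]), y)

print("marker alone, t =", t_single[0])
print("marker | locus_a + locus_b, t =", t_joint[0])
print("locus_a, locus_b in joint model, t =", t_joint[1], t_joint[2])
```

In this setup the marker's t-statistic is large in the single-locus model and close to zero in the joint model, which mirrors the logic of asking whether a peak is "wholly accounted for" by flanking loci.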

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis Proteins* / genetics
  • Arabidopsis* / genetics
  • Flowers / genetics
  • Genetic Heterogeneity
  • Genome-Wide Association Study


Substances

  • Arabidopsis Proteins