Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers

Theor Appl Genet. 2006 Apr;112(6):1104-14. doi: 10.1007/s00122-006-0212-7. Epub 2006 Feb 2.


Population-based methods for the genetic mapping of adaptive traits and the analysis of natural selection require that the population structure and demographic history of a species be taken into account. We characterized geographic patterns of genetic variation in the model plant Arabidopsis thaliana by genotyping 115 genome-wide single nucleotide polymorphism (SNP) markers in 351 accessions from the whole species range using a matrix-assisted laser desorption/ionization time-of-flight assay, and by sequencing nine unlinked short genomic regions in a subset of 64 accessions. The observed frequency distribution of SNPs shows an excess of rare polymorphisms and is therefore not consistent with a constant-size neutral model of sequence polymorphism. There is evidence for significant population structure, as indicated by differences in genetic diversity between geographic regions. Accessions from Central Asia have a low level of polymorphism and an increased level of genome-wide linkage disequilibrium (LD) relative to accessions from the Iberian Peninsula and Central Europe. Cluster analysis with the program STRUCTURE grouped Eurasian accessions into K = 6 clusters. Accessions from the Iberian Peninsula and from Central Asia constitute distinct populations, whereas Central and Eastern European accessions represent admixed populations in which genomes were reshuffled by historical recombination events. These patterns likely result from a rapid postglacial recolonization of Eurasia from glacial refugial populations. Our analyses suggest that mapping populations for association or LD mapping should be chosen from a regional rather than a species-wide sample, or identified genetically as sets of individuals with similar average genetic distances.
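The abstract does not say which statistic was used to compare the SNP frequency distribution with the constant-size neutral model. One common choice is Tajima's D, which is negative when rare polymorphisms are in excess. The following is a minimal sketch under that assumption; the function name `tajimas_d` and the input layout (one minor-allele count per segregating site) are illustrative and not taken from the paper.

```python
from math import sqrt


def tajimas_d(minor_counts, n):
    """Tajima's D from per-site allele counts (hypothetical helper).

    minor_counts: one entry per segregating site, giving how many of the
                  n sampled sequences carry the rarer allele (1 .. n-1).
    n:            number of sampled sequences (at least 4, because the
                  variance term is zero for smaller samples).
    """
    S = len(minor_counts)
    if n < 4 or S == 0:
        raise ValueError("need n >= 4 and at least one segregating site")
    if any(k < 1 or k > n - 1 for k in minor_counts):
        raise ValueError("each count must lie between 1 and n - 1")

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in minor_counts)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimators, scaled by its
    # estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Toy data, not from the study: ten sites that are all singletons
# (rare variants) versus ten sites at intermediate frequency.
d_rare = tajimas_d([1] * 10, n=10)
d_common = tajimas_d([5] * 10, n=10)
```

In this toy input, `d_rare` is negative and `d_common` is positive. A negative value alone does not separate population growth from selection. Markers chosen from a small discovery panel can also distort the frequency spectrum.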

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • DNA, Plant / genetics
  • Genetic Markers*
  • Genetics, Population
  • Genome, Plant*
  • Genotype
  • Geography
  • Linkage Disequilibrium
  • Polymorphism, Single Nucleotide*


Substances

  • DNA, Plant
  • Genetic Markers