Development of an alfalfa SNP array and its use to evaluate patterns of population structure and linkage disequilibrium

PLoS One. 2014 Jan 9;9(1):e84329. doi: 10.1371/journal.pone.0084329. eCollection 2014.


A large set of genome-wide markers and a high-throughput genotyping platform can facilitate the genetic dissection of complex traits and accelerate molecular breeding applications. Previously, we identified about 0.9 million SNP markers by sequencing the transcriptomes of 27 diverse alfalfa genotypes. From this SNP set, we developed an Illumina Infinium array containing 9,277 SNPs. Using this array, we genotyped 280 diverse alfalfa genotypes and several genotypes from related species. About 81% (7,476) of the SNPs met the quality control criteria and were polymorphic. The alfalfa SNP array also showed a high level of transferability to several closely related Medicago species. Principal component analysis and model-based clustering showed clear population structure corresponding to subspecies and ploidy levels. Within cultivated tetraploid alfalfa, genotypes from dormant and nondormant cultivars were largely assigned to different clusters; genotypes from semidormant cultivars were split between the two groups. Across all genotypes, linkage disequilibrium (LD) decayed rapidly, reaching r² = 0.2 at 26 Kbp, but the rate of decay varied across ploidy levels and subspecies. LD was highly consistent between and within the two subpopulations of cultivated dormant and nondormant alfalfa, suggesting that genome-wide association studies (GWAS) and genomic selection (GS) could be conducted using alfalfa genotypes from throughout the fall dormancy spectrum. However, the relatively low LD levels would require a large number of markers to fully saturate the genome.
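
The abstract does not say how r² was estimated or how the decay distance was read from the LD curve. As a minimal sketch only, the Python below assumes r² is taken as the squared Pearson correlation between allele-dosage vectors of two SNPs, and that the decay distance is found by linear interpolation on a curve of mean r² against physical distance. The function names `r_squared` and `decay_distance` and the toy numbers are hypothetical illustrations, not values or methods from the study.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two allele-dosage vectors.

    Assumption: genotypes are coded as allele counts (e.g. 0-4 for a
    tetraploid). This is one common LD proxy, not necessarily the
    estimator used in the paper.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic marker: r^2 is undefined
    return sxy * sxy / (sxx * syy)


def decay_distance(curve, threshold=0.2):
    """Distance at which mean r^2 first falls to `threshold`.

    `curve` is a list of (distance_bp, mean_r2) pairs sorted by distance.
    Returns None if the curve never crosses the threshold.
    """
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 >= threshold >= r1 and r0 != r1:
            # Linear interpolation between the two bracketing points.
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None


# Toy data (hypothetical): two perfectly correlated SNPs and a made-up curve.
snp_a = [0, 1, 2, 3, 4]
snp_b = [0, 1, 2, 3, 4]
toy_curve = [(0, 0.4), (10_000, 0.3), (30_000, 0.1)]

ld = r_squared(snp_a, snp_b)
crossing_bp = decay_distance(toy_curve)
```

With real data, `curve` would typically come from binning pairwise r² values by inter-marker distance; a fitted nonlinear decay model is a common alternative to the linear interpolation used here.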

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Breeding
  • Cluster Analysis
  • Genetic Markers
  • Genetic Variation
  • Genetics, Population
  • Linkage Disequilibrium / genetics*
  • Medicago sativa / genetics*
  • Oligonucleotide Array Sequence Analysis*
  • Phylogeny
  • Plant Dormancy
  • Polymorphism, Single Nucleotide / genetics*
  • Principal Component Analysis
  • Reproducibility of Results
  • Seasons


Grant support

This research was funded by the USDA-DOE Plant Feedstock Genomics for Bioenergy program, award # 2009-65504-05809 to ECB and MJM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.