Genetic structure and diversity of cultivated soybean (Glycine max (L.) Merr.) landraces in China

Theor Appl Genet. 2008 Oct;117(6):857-71. doi: 10.1007/s00122-008-0825-0. Epub 2008 Jun 28.


The Chinese genebank contains 23,587 soybean landraces collected from 29 provinces. In this study, a representative collection of 1,863 landraces was assessed for genetic diversity and genetic differentiation to provide information for effective management and utilization. A total of 1,160 alleles were detected at 59 SSR loci, including 97 unique and 485 low-frequency alleles, which indicates great richness and uniqueness of genetic variation in this core collection. Seven clusters were inferred by STRUCTURE analysis, in good agreement with a neighbor-joining tree. The cluster subdivision was also supported by highly significant pairwise Fst values and was generally in accordance with differences in planting area and sowing season. The cluster HSuM, which contains accessions collected from the region between 32.0 and 40.5 degrees N and 105.4 and 122.2 degrees E along the middle and lower reaches of the Yellow River, was the most genetically diverse of the seven clusters. This provides the first molecular evidence for the hypothesis that cultivated soybean originated in the Yellow River region. A high proportion (95.1%) of pairs of alleles from different loci were in linkage disequilibrium (LD) in the complete dataset. This was mostly due to overall population structure, since the number of locus pairs in LD dropped sharply within each cluster compared with the complete dataset. Population structure therefore needs to be accounted for in association studies conducted within this collection. The low level of LD within the clusters can be seen as evidence that many past recombination events have been maintained in soybean, fixed in homozygous self-fertilizing landraces.
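
The allele counts reported above (total, unique, and low-frequency alleles) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define "unique" or "low-frequency", so the code assumes that a unique allele is one carried by exactly one accession and that a low-frequency allele is any other allele below a hypothetical 5% threshold. The function name classify_alleles, the threshold, and the example allele sizes are all illustrative assumptions. Each accession is treated as contributing one allele per locus, which is reasonable only for homozygous, self-fertilizing lines.

```python
from collections import Counter


def classify_alleles(calls, low_freq_threshold=0.05):
    """Summarize allele calls at one SSR locus.

    calls: one allele (e.g. fragment size in bp) per accession; None = missing.
    low_freq_threshold: ASSUMED cutoff; the abstract does not state one.
    """
    observed = [a for a in calls if a is not None]  # drop missing data
    counts = Counter(observed)
    n = len(observed)

    # ASSUMPTION: "unique" = allele found in exactly one accession.
    unique = sorted(a for a, c in counts.items() if c == 1)
    # ASSUMPTION: "low-frequency" = non-unique allele below the threshold.
    low_frequency = sorted(
        a for a, c in counts.items() if c > 1 and c / n < low_freq_threshold
    )
    return {
        "n_alleles": len(counts),
        "unique": unique,
        "low_frequency": low_frequency,
    }


# Hypothetical locus scored in 100 accessions (allele sizes are made up).
example_calls = [150] * 60 + [153] * 37 + [156] * 2 + [159] * 1
summary = classify_alleles(example_calls)
```

Summing n_alleles, unique, and low_frequency over all loci would give collection-wide totals of the kind quoted in the abstract, under whatever definitions the analysis actually uses.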

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Breeding
  • China
  • Cluster Analysis
  • Databases, Genetic
  • Gene Frequency
  • Genetic Markers
  • Genetic Variation
  • Linkage Disequilibrium
  • Phylogeny
  • Quantitative Trait Loci
  • Soybeans / classification
  • Soybeans / genetics*

