Whole-genome sequence diversity and association analysis of 198 soybean accessions in mini-core collections

DNA Res. 2021 Jan 19;28(1):dsaa032. doi: 10.1093/dnares/dsaa032.

Abstract

To examine genetic diversity and facilitate the use of soybean genetic resources, we performed whole-genome Illumina resequencing of 198 accessions and identified 10 million single nucleotide polymorphisms and 2.8 million small indels. In addition, PacBio resequencing of 10 accessions identified a total of 2,033 structural variants. Genetic diversity and population structure analyses grouped the 198 accessions into three subgroups (Primitive, World, and Japan) and suggested a long and relatively isolated history of cultivated soybean in Japan. The skewed regional distribution of variants in the genome, such as higher structural variation in the R gene clusters in the Japan group, suggested the possibility of selective sweeps during domestication or breeding. A genome-wide association study identified both known and novel causal variants in genes controlling the flowering period. Novel candidate causal variants were also found in genes related to seed coat colour by aligning Illumina and PacBio reads together. The genomic sequences and variants obtained in this study provide a resource for soybean breeding and genetic studies that may uncover novel alleles or genes involved in agronomically important traits.

Keywords: Glycine max; genome diversity; next-generation sequencing; soybean.

MeSH terms

  • Genetic Variation*
  • Genome, Plant*
  • Genome-Wide Association Study
  • Glycine max / genetics*
  • High-Throughput Nucleotide Sequencing
  • INDEL Mutation
  • Polymorphism, Single Nucleotide
  • Whole Genome Sequencing