G3 (Bethesda). 2019 Jul 9;9(7):2253-2265.
doi: 10.1534/g3.118.200917.

Genomic Selection for Yield and Seed Composition Traits Within an Applied Soybean Breeding Program

Benjamin B Stewart-Brown et al. G3 (Bethesda). 2019.

Abstract

Genomic selection (GS) has become viable for selection of quantitative traits for which marker-assisted selection has often proven less effective. The potential of GS for soybean was characterized using 483 elite breeding lines, genotyped with BARCSoySNP6K iSelect BeadChips. Cross validation was performed using RR-BLUP, and predictive abilities (rMP) of 0.81, 0.71, and 0.26 were achieved for protein, oil, and yield, respectively, at the largest tested training set size. Minimal differences were observed when comparing different marker densities, and there appeared to be inflation in rMP due to population structure. For comparison purposes, two additional methods to predict breeding values for lines of four bi-parental populations within the GS dataset were tested. The first method predicted within each bi-parental population (WP method) and utilized a training set of full-sibs of the validation set. The second method utilized a training set of all remaining breeding lines except for full-sibs of the validation set to predict across populations (AP method). The AP method is more practical, as the WP method would likely delay the breeding cycle and leverage smaller training sets. Averaging across populations for protein and oil content, rMP for the AP method (0.55, 0.30) approached rMP for the WP method (0.60, 0.52). Though comparable, rMP for yield was low for both AP and WP methods (0.12, 0.13). Based on increases in rMP as training sets increased and the effectiveness of the WP vs. AP method, the AP method could potentially improve with larger training sets and increased relatedness between training and validation sets.

Keywords: GenPred; Genomic Prediction; Genomic selection; RR-BLUP; Seed composition; Seed yield; Shared Data Resources; Soybean.


Figures

Figure 1
Diagram displaying the three methods performed for estimating predictive ability within the genomic selection dataset. (A) Perform cross-validation using the entire mixed population as both the validation set and training set (EGSD method); (B) Perform cross-validation within bi-parental populations using Pop1-4 individually as the validation set and training set (WP method); and (C) Predict across populations using one of Pop1-4 as the validation set and the remaining breeding lines as the training set (AP method).
Figure 2
Principal component analysis of genomic selection dataset.
Figure 3
Boxplots of the effect of training set size (NP) on predictive ability (rMP) for each trait when utilizing the entire genomic selection dataset (EGSD) method. Solid line represents median and dotted line represents mean.
Figure 4
Boxplots of the effect of marker density (NM) on predictive ability (rMP) for each trait when utilizing the entire genomic selection dataset (EGSD) method. Number of markers indicated in parentheses. Solid line represents median and dotted line represents average.
Figure 5
Graph displaying the effect of training set size (NP) on predictive ability (rMP) for each trait when contrasting the within population (WP) method vs. the across population (AP) method. rMP was averaged across the four validation sets (Pop1-4). The WP method was indicated with a horizontal dashed line while the AP method was indicated with a solid trend line across TS sizes. For the WP method, a single training set size of 50 breeding lines was used.
Figure 6
Effects of population structure on prediction of oil content when utilizing the entire genomic selection dataset (EGSD) method. (A) PCA of genomic prediction population using all SNPs. (B) PCA of genomic prediction population using 8th tag SNPs. (C) Average predicted GEBV vs. observed BLUP values when using all SNPs. (D) Average predicted GEBV vs. observed BLUP values when using 8th tag SNPs. (E) Average predicted GEBV vs. observed BLUP within Pop1-4 when using all SNPs. (F) Average predicted GEBV vs. observed BLUP within Pop1-4 when using 8th tag SNPs. Correlation coefficients presented within scatterplots (C-F).
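
The difference between the WP and AP methods in Figure 1 comes down to how the training and validation sets are drawn. The sketch below illustrates the two splits in plain Python. The line names, population labels, and set sizes are made up for the example, and `wp_split` and `ap_split` are hypothetical helper names, not functions from the study.

```python
import random


def wp_split(pop_of, target_pop, n_train, rng):
    """Within-population (WP): train and validate on full-sibs of one bi-parental population."""
    members = [line for line, pop in pop_of.items() if pop == target_pop]
    rng.shuffle(members)
    return members[:n_train], members[n_train:]


def ap_split(pop_of, target_pop):
    """Across-population (AP): validate on one population, train on all other lines."""
    valid = [line for line, pop in pop_of.items() if pop == target_pop]
    train = [line for line, pop in pop_of.items() if pop != target_pop]
    return train, valid


# Hypothetical dataset: four bi-parental populations of 8 lines each,
# plus 20 other breeding lines that belong to none of them.
pop_of = {f"Pop{k}_line{i}": f"Pop{k}" for k in range(1, 5) for i in range(8)}
pop_of.update({f"other_line{i}": "other" for i in range(20)})

wp_train, wp_valid = wp_split(pop_of, "Pop1", n_train=5, rng=random.Random(1))
ap_train, ap_valid = ap_split(pop_of, "Pop1")
print(len(wp_train), len(wp_valid), len(ap_train), len(ap_valid))
```

In the AP split no full-sib of the validation lines enters the training set, which is why the training set can be large but is less closely related to the lines being predicted.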
