J Anim Sci. 2018 Nov 21;96(11):4490-4500.
doi: 10.1093/jas/sky316.

Do Stronger Measures of Genomic Connectedness Enhance Prediction Accuracies Across Management Units?

Free PMC article

Haipeng Yu et al. J Anim Sci.

Abstract

Genetic connectedness assesses the extent to which estimated breeding values can be fairly compared across management units. Ranking of individuals across units based on best linear unbiased prediction (BLUP) is reliable when there is a sufficient level of connectedness, because genetic signal is then better disentangled from noise. Connectedness arises from genetic relationships among individuals. Although a recent study showed that genomic relatedness strengthens the estimates of connectedness across management units compared with pedigree-based relatedness, the relationship between connectedness measures and prediction accuracies has been explored only to a limited extent. In this study, we examined whether increased measures of connectedness led to higher prediction accuracies, as evaluated by cross-validation (CV) in computer simulations. We applied prediction error variance of the difference (PEVD), coefficient of determination (CD), and BLUP-type prediction models to data simulated under various scenarios. We found that a greater extent of connectedness enhanced accuracy of whole-genome prediction. The impact of genomics was more marked when large numbers of markers were used to infer connectedness and evaluate prediction accuracy. Connectedness across units increased with the proportion of connecting individuals, and this increase was associated with improved accuracy of prediction. The use of genomic information resulted in increased estimates of connectedness and improved prediction accuracies compared with those of pedigree-based models when there were enough markers to capture variation due to QTL signals.
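
To make the two connectedness measures named in the abstract concrete, the following is a minimal Python sketch of how PEVD and CD could be computed for the contrast between the mean breeding values of two management units. It is an illustration under stated assumptions, not the authors' implementation: the function name `unit_contrast_connectedness`, the choice of unit means as the only fixed effects, one record per individual, the unit-mean contrast as the summary, and the default variance components are all assumptions introduced here. The formulas used are the standard mixed-model ones, PEV = sigma2_e * (M + lambda * K^-1)^-1 and CD(x) = 1 - x'PEV x / (sigma2_u * x'K x), where K is a relationship matrix such as A, G, or G*.

```python
import numpy as np

def unit_contrast_connectedness(K, unit, sigma2_u=1.0, sigma2_e=1.0):
    """Return (PEVD, CD) for the contrast between two management-unit means.

    K     : positive-definite relationship matrix (n x n), e.g. A, G, or G*
    unit  : length-n array of unit labels with exactly two distinct values
    Assumes one record per individual and unit means as the only fixed effects.
    """
    K = np.asarray(K, dtype=float)
    unit = np.asarray(unit)
    labels = np.unique(unit)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two management units")
    n = K.shape[0]

    # Incidence matrix of the fixed unit effects.
    X = (unit[:, None] == labels[None, :]).astype(float)
    # Projector that absorbs the fixed effects.
    M = np.eye(n) - X @ np.linalg.pinv(X.T @ X) @ X.T

    lam = sigma2_e / sigma2_u
    # Random-effect block of the inverse mixed-model coefficient matrix.
    C22 = np.linalg.inv(M + lam * np.linalg.inv(K))
    pev = sigma2_e * C22

    # Contrast: mean of unit 1 minus mean of unit 2.
    in_first = unit == labels[0]
    x = np.where(in_first, 1.0 / in_first.sum(), -1.0 / (~in_first).sum())

    pevd = float(x @ pev @ x)
    cd = 1.0 - pevd / (sigma2_u * float(x @ K @ x))
    return pevd, cd

# Toy example: six individuals, three per unit.
units = np.array([1, 1, 1, 2, 2, 2])
K_unrelated = np.eye(6)
K_linked = np.eye(6)
K_linked[0, 3] = K_linked[3, 0] = 0.5  # one relationship across units

print(unit_contrast_connectedness(K_unrelated, units))
print(unit_contrast_connectedness(K_linked, units))
```

In this toy setting, units with no relationships between them give a CD of zero for the unit contrast, while a single cross-unit relationship lowers PEVD and raises CD, which is the direction of effect the abstract describes.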

Figures

Figure 1.
Genomic data simulation parameters. SNPs, QTLs, and h² represent total single nucleotide polymorphisms, quantitative trait loci, and trait heritability, respectively. Simulations were carried out across 2 different h² (0.8 and 0.2), 2 different numbers of QTLs (1,015 and 290), and 2 different SNP densities (50,000 and 5,000).
Figure 2.
Management unit (MU) simulation scenarios. (A) Scenario 1 (least connected design). Individuals within clusters 1 to 5 were assigned to MU1 and clusters 6 to 10 were assigned to MU2. (B) Scenarios 2 to 6 (partially connected to connected). The degree of connectedness was gradually increased by exchanging 10% (Scenario 2), 20% (Scenario 3), 30% (Scenario 4), 40% (Scenario 5), and 50% (Scenario 6) of randomly sampled individuals between MU1 and MU2. Scenario 6 corresponds to the connected design.
Figure 3.
Average relationship coefficients across management units with 5,000 markers over 2 heritability levels and 2 different numbers of quantitative trait loci. S1 to S6 denote management unit simulation scenarios 1, 2, 3, 4, 5, and 6, respectively. The level of connectedness steadily increased from S1 to S6. We compared pedigree-based A, genome-based G, and rescaled genome-based G* relationship kernel matrices.
Figure 4.
Relationship between connectedness and prediction accuracy. PEVD and PA denote prediction error variance of the difference and prediction accuracy, respectively. PA was defined as the correlation between phenotypes and estimated breeding values, cor(g, ĝ). Connectedness measures based on pedigree-based A, genome-based G, and rescaled genome-based G* in the 6 management unit simulation scenarios across 2 heritabilities were compared with their prediction accuracies in each graph. (A) 290 QTLs and 5,000 markers. (B) 290 QTLs and 50,000 markers. (C) 1,015 QTLs and 5,000 markers. (D) 1,015 QTLs and 50,000 markers.
Figure 5.
Relationship between connectedness and prediction accuracy. CD and PA denote coefficient of determination and prediction accuracy, respectively. PA was defined as the correlation between phenotypes and estimated breeding values, cor(g, ĝ). Connectedness measures based on pedigree-based A, genome-based G, and rescaled genome-based G* in the 6 management unit simulation scenarios across 2 heritabilities were compared with their prediction accuracies in each graph. (A) 290 QTLs and 5,000 markers. (B) 290 QTLs and 50,000 markers. (C) 1,015 QTLs and 5,000 markers. (D) 1,015 QTLs and 50,000 markers.
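
The management unit scenarios described in the Figure 2 caption can be sketched in a few lines of Python. This is a hedged illustration only: the helper name `exchange_units`, the cluster size of 10 individuals, the fixed random seed, and the reading of "exchanging 10%" as swapping that proportion of each unit's individuals are assumptions made here, not details taken from the paper.

```python
import numpy as np

def exchange_units(unit, proportion, rng):
    """Swap a proportion of randomly sampled individuals between MU1 and MU2.

    unit       : array of unit labels (1 or 2), one per individual
    proportion : fraction of each unit to exchange (0.0 for Scenario 1,
                 0.1 to 0.5 for Scenarios 2 to 6)
    rng        : numpy random Generator
    """
    unit = np.asarray(unit).copy()
    idx1 = np.flatnonzero(unit == 1)
    idx2 = np.flatnonzero(unit == 2)
    # Same number moved in each direction, so unit sizes are preserved.
    k = int(round(proportion * min(idx1.size, idx2.size)))
    to_mu2 = rng.choice(idx1, size=k, replace=False)
    to_mu1 = rng.choice(idx2, size=k, replace=False)
    unit[to_mu2] = 2
    unit[to_mu1] = 1
    return unit

# Scenario 1: clusters 1 to 5 in MU1, clusters 6 to 10 in MU2
# (10 individuals per cluster is an arbitrary choice for this sketch).
cluster = np.repeat(np.arange(1, 11), 10)
scenario1 = np.where(cluster <= 5, 1, 2)

rng = np.random.default_rng(0)
scenarios = {1: scenario1}
for s, p in zip(range(2, 7), (0.1, 0.2, 0.3, 0.4, 0.5)):
    scenarios[s] = exchange_units(scenario1, p, rng)
```

Each scenario is derived from the Scenario 1 assignment rather than from the previous scenario, which keeps the exchanged proportion easy to interpret; whether the original simulation nested the scenarios is not stated in the caption.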

