Sci Rep. 2018 Aug 17;8(1):12309.
doi: 10.1038/s41598-018-30089-2.

Predictive Ability of Genome-Assisted Statistical Models Under Various Forms of Gene Action

Free PMC article

Mehdi Momen et al. Sci Rep. 2018.

Abstract

Recent work has suggested that the performance of prediction models for complex traits may depend on the architecture of the target traits. Here we compared several prediction models with respect to their ability to predict phenotypes under various statistical architectures of gene action: (1) purely additive, (2) additive and dominance, (3) additive, dominance, and two-locus epistasis, and (4) purely epistatic settings. Simulated data and a real chicken dataset were used. Fourteen prediction models were compared: BayesA, BayesB, BayesC, Bayesian LASSO, Bayesian ridge regression, elastic net, genomic best linear unbiased prediction, a Gaussian process, LASSO, random forests, reproducing kernel Hilbert spaces regression, ridge regression (best linear unbiased prediction), relevance vector machines, and support vector machines. When the trait was under additive gene action, the parametric prediction models outperformed the non-parametric ones. Conversely, when the trait was under epistatic gene action, the non-parametric prediction models provided more accurate predictions. Thus, prediction models must be selected according to the most probable underlying architecture of the trait. In the chicken dataset examined, most models had similar prediction performance. Our results corroborate the view that there is no universally best prediction model, and that the development of robust prediction models is an important research objective.
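
To make the four gene action scenarios concrete, the sketch below builds a genetic value from additive, dominance, and additive-by-additive epistatic terms, then adds noise to reach a target broad-sense heritability. It is an illustration under stated assumptions, not the simulation used in the paper: the 0/1/2 genotype coding, the centering, the effect values, and the function names `genetic_value` and `simulate_phenotypes` are all hypothetical.

```python
import random


def genetic_value(genotype, additive, dominance=None, epistasis=None):
    """Genetic value of one individual (hypothetical parameterization).

    genotype  : allele counts per locus, each 0, 1 or 2
    additive  : per-locus additive effects
    dominance : optional per-locus deviations, applied to heterozygotes only
    epistasis : optional dict {(j, k): effect} of additive-by-additive pairs
    """
    centered = [x - 1 for x in genotype]  # recode 0/1/2 as -1/0/1
    value = sum(a * c for a, c in zip(additive, centered))
    if dominance is not None:
        value += sum(d for d, x in zip(dominance, genotype) if x == 1)
    if epistasis is not None:
        value += sum(e * centered[j] * centered[k]
                     for (j, k), e in epistasis.items())
    return value


def simulate_phenotypes(genetic_values, h2, rng):
    """Add Gaussian noise so that var(g) / var(phenotype) is about h2."""
    n = len(genetic_values)
    mean_g = sum(genetic_values) / n
    var_g = sum((g - mean_g) ** 2 for g in genetic_values) / (n - 1)
    sd_e = (var_g * (1.0 - h2) / h2) ** 0.5  # residual standard deviation
    return [g + rng.gauss(0.0, sd_e) for g in genetic_values]


# One individual at three loci under each scenario (effects are made up).
geno = [0, 1, 2]
add = [1.0, 1.0, 1.0]
dom = [0.5, 0.5, 0.5]
epi = {(0, 2): 2.0}

purely_additive = genetic_value(geno, add)
additive_dominance = genetic_value(geno, add, dom)
additive_dominance_epistasis = genetic_value(geno, add, dom, epi)
purely_epistatic = genetic_value(geno, [0.0, 0.0, 0.0], epistasis=epi)
```

Setting the additive effects to zero leaves only the interaction term, which is one simple way to picture a purely epistatic trait: no single locus carries signal on its own under this coding, so a model that is linear in the markers has little to work with.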

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Overall mean (standard error) of predictive and empirical accuracy of different prediction models under various gene action scenarios: purely additive (Ad), additive and dominance (Ad:Dom), additive, dominance, and epistasis (Ad:Dom:Epi), and purely epistatic (Epi).
Figure 2
Predictive and empirical accuracies of genomic prediction models for traits simulated under purely additive (Ad), additive:dominance (Ad:Dom), additive:dominance:epistatic (Ad:Dom:Epi), and purely epistatic (Epi) gene action scenarios with a broad-sense heritability of 0.30, 0.40, 0.80 and 0.30, respectively. Prediction models: BayesA, BayesB, BayesC, Bayesian least absolute shrinkage and selection operator (BL), Bayesian ridge regression (BRR), elastic net (EN), genomic best linear unbiased prediction (GBLUP), Gaussian process (GP), least absolute shrinkage and selection operator (LASSO), random forest (RF), reproducing kernel Hilbert spaces regression (RKHS), ridge regression best linear unbiased prediction (rrBLUP), relevance vector machine (RVM), and support vector machine (SVM).
Figure 3
Boxplots of bias (regression coefficient of simulated phenotypes on genomic estimated breeding values) for traits simulated under purely additive (Ad), additive:dominance (Ad:Dom), additive:dominance:epistatic (Ad:Dom:Epi) and purely epistatic (Epi) gene action scenarios with a heritability of 0.30, 0.40, 0.80 and 0.30, respectively. Prediction models: BayesA, BayesB, BayesC, Bayesian least absolute shrinkage and selection operator (BL), Bayesian ridge regression (BRR), elastic net (EN), genomic best linear unbiased prediction (GBLUP), Gaussian process (GP), least absolute shrinkage and selection operator (LASSO), random forest (RF), reproducing kernel Hilbert spaces regression (RKHS), ridge regression best linear unbiased prediction (rrBLUP), relevance vector machine (RVM), and support vector machine (SVM). Outliers are denoted as black dots.
Figure 4
Ward’s hierarchical clustering on predicted genomic values derived from traits simulated under purely additive (Ad), additive:dominance (Ad:Dom), additive:dominance:epistatic (Ad:Dom:Epi) and purely epistatic (Epi) gene action. Prediction models: BayesA, BayesB, BayesC, Bayesian least absolute shrinkage and selection operator (BL), Bayesian ridge regression (BRR), elastic net (EN), genomic best linear unbiased prediction (GBLUP), Gaussian process (GP), least absolute shrinkage and selection operator (LASSO), random forest (RF), reproducing kernel Hilbert spaces regression (RKHS), ridge regression best linear unbiased prediction (rrBLUP), relevance vector machine (RVM) and support vector machine (SVM).
Figure 5
Boxplots of bias (regression coefficient of observed phenotypes on genomic estimated breeding values) obtained in the testing sets from a 20-fold cross-validation using chicken data for body weight (BW), breast meat (BM) and hen-house production (HHP). Prediction models: BayesA, BayesB, BayesC, Bayesian least absolute shrinkage and selection operator (BL), Bayesian ridge regression (BRR), elastic net (EN), genomic best linear unbiased prediction (GBLUP), Gaussian process (GP), least absolute shrinkage and selection operator (LASSO), random forest (RF), reproducing kernel Hilbert spaces regression (RKHS), ridge regression best linear unbiased prediction (rrBLUP), relevance vector machine (RVM) and support vector machine (SVM). Outliers are denoted as black dots.
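
The bias reported in the captions above is the regression coefficient of phenotypes on genomic estimated breeding values, evaluated in testing sets. A minimal sketch of that metric follows, together with a correlation-based accuracy and a fold splitter. It assumes that predictive accuracy means the Pearson correlation between predictions and phenotypes, and the helper names `pearson`, `regression_slope` and `kfold_indices` are hypothetical; the interleaved fold assignment is an arbitrary choice, not the paper's procedure.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def regression_slope(y, x):
    """Least-squares slope of y regressed on x (here: phenotype on GEBV)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def kfold_indices(n, k):
    """Assign indices 0..n-1 to k interleaved testing folds."""
    return [list(range(i, n, k)) for i in range(k)]


# Toy testing set: predictions are perfectly correlated with phenotypes
# but only half as spread out, so the slope is 2 rather than 1.
gebv = [1.0, 2.0, 3.0, 4.0]
phenotype = [2.0, 4.0, 6.0, 8.0]

accuracy = pearson(gebv, phenotype)
bias = regression_slope(phenotype, gebv)
folds = kfold_indices(40, 20)  # 20 testing folds of 2 records each
```

A slope near 1 is commonly read as unbiased prediction; a slope below 1 suggests the predicted values are more dispersed than the phenotypes support, and a slope above 1 suggests they are too compressed. The toy data show why both metrics are reported: the correlation is perfect while the slope is far from 1.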
