Approximate Genome-Based Kernel Models for Large Data Sets Including Main Effects and Interactions
- PMID: 33193659
- PMCID: PMC7594507
- DOI: 10.3389/fgene.2020.567757
Abstract
The rapid development of molecular markers and sequencing technologies has made it possible to use genomic prediction (GP) and genomic selection (GS) in animal and plant breeding. However, when the number of observations (n) is large (thousands or millions), the computational cost of handling the large genomic kernel relationship matrices (inverting and decomposing them) grows steeply with n. The problem worsens when genotype × environment (G × E) interaction and multi-trait kernels are included in the model. In this research we propose selecting a small number of lines m (m < n) to construct an approximate kernel of lower rank than the original, thereby greatly decreasing the required computing time. First, we describe the full genomic method for a single environment (FGSE), with a covariance matrix (kernel) that includes all n lines. Second, we select m lines and approximate the original kernel for the single-environment model (APSE). Similarly, but including main effects and G × E, we describe a full genomic method with the genotype × environment model (FGGE) and, using m lines, the approximate kernel method with G × E (APGE). We applied the proposed method to two wheat data sets of different sizes (n), using the standard linear kernel Genomic Best Linear Unbiased Predictor (GBLUP) and also using eigenvalue decomposition. In both data sets, we compared the prediction performance and computing time of FGSE versus APSE and of FGGE versus APGE. The approximate methods showed competitive prediction performance with a significant reduction in computing time. Genomic prediction accuracy depends on the decay of the eigenvalues of the original kernel (the amount of variance information lost) as well as on the number of selected lines m.
Keywords: approximate kernels; computing time; genomic-enabled prediction; genotype × environment interaction; large data sets.
Copyright © 2020 Cuevas, Montesinos-López, Martini, Pérez-Rodríguez, Lillemo and Crossa.
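The abstract does not spell out how the approximate kernel is built from the m selected lines. As an illustration only, the sketch below assumes a Nyström-type construction, a common way to obtain a lower-rank kernel from a subset of rows and columns, applied to a GBLUP-style linear kernel. The function names (linear_kernel, nystrom_factor), the simulated marker matrix, and the random choice of the m lines are all hypothetical and are not taken from the paper.

```python
import numpy as np

def linear_kernel(X):
    """GBLUP-style linear kernel K = Xc Xc' / p from column-centered markers."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T / X.shape[1]

def nystrom_factor(K, idx, tol=1e-10):
    """Return P (n x r, r <= m) such that P P' = K[:, idx] pinv(K[idx, idx]) K[idx, :].

    Assumption: a Nystrom-type construction; only the small m x m block
    is eigendecomposed, never the full n x n kernel.
    """
    K_nm = K[:, idx]                      # n x m block: all lines vs selected lines
    K_mm = K[np.ix_(idx, idx)]            # m x m block: selected lines only
    vals, vecs = np.linalg.eigh(K_mm)     # cheap eigendecomposition (size m)
    keep = vals > tol * vals.max()        # drop numerically zero eigenvalues
    return K_nm @ vecs[:, keep] / np.sqrt(vals[keep])

# Toy data (simulated, not the wheat data sets): n lines, p markers coded 0/1/2.
rng = np.random.default_rng(0)
n, p, m = 200, 50, 60
X = rng.integers(0, 3, size=(n, p)).astype(float)

K = linear_kernel(X)                            # full kernel, all n lines
idx = rng.choice(n, size=m, replace=False)      # hypothetical random choice of m lines
P = nystrom_factor(K, idx)                      # low-rank factor
K_approx = P @ P.T                              # approximate kernel of rank <= m

# Relative Frobenius error of the approximation.
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(P.shape, rel_err)
```

In this toy setting the number of selected lines exceeds the number of markers, so the full kernel already has rank below m and the approximation should be essentially exact. With real marker data the error would depend on how quickly the eigenvalues of the original kernel decay and on m, which matches the dependence stated in the abstract. Under this assumed construction, a model can work with the n × r factor P instead of the n × n kernel, which is where the computing time would be saved.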