Kernel-based whole-genome prediction of complex traits: a review

Gota Morota; Daniel Gianola

doi:10.3389/fgene.2014.00363

Front Genet. 2014 Oct 16;5:363. doi: 10.3389/fgene.2014.00363. eCollection 2014.

Authors

Gota Morota¹, Daniel Gianola²

Affiliations

¹ Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE, USA.
² Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA; Department of Dairy Science, University of Wisconsin-Madison, Madison, WI, USA.

Abstract

Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data, especially molecular markers, fueled by post-Sanger sequencing technologies have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

Keywords: SNP; kernel methods; semi-parametric regression; spatial distance; whole-genome prediction.

Publication types

Review