Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances

G Su; O F Christensen; L Janss; M S Lund

doi:10.3168/jds.2014-8210

J Dairy Sci. 2014 Oct;97(10):6547-59. doi: 10.3168/jds.2014-8210. Epub 2014 Aug 14.

Authors

G Su¹, O F Christensen², L Janss², M S Lund²

Affiliations

¹ Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark. Electronic address: guosheng.su@agrsci.dk.
² Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark.

PMID: 25129495

Abstract

Various models have been used for genomic prediction. Bayesian variable selection models often predict genomic breeding values more accurately than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of its low computational demand. The objective of this study was to achieve the benefits of both models by using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic relationship matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Five weighting factors were investigated. Three came from an analysis using a Bayesian mixture model with 4 normal distributions: the posterior SNP variance, the square of the posterior SNP effect, and the corresponding minus base-10 logarithm of the marker association P-value [-log10(P)] of a t-test. Two came from an analysis using a classical genome-wide association study model (linear regression model): the square of the estimated SNP effect and the corresponding -log10(P) of a t-test. The weights were derived from data sets that preceded the genomic prediction by 0, 1, 3, or 5 yr. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix.
The advantage of GBLUP with a weighted G-matrix over GBLUP with an original unweighted G-matrix was largest when the weighting factor was the posterior SNP variance, resulting in 1.7 percentage points higher reliability. The second best weighting factors were the -log10(P) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and the -log10(P) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of the estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set with a lag of up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and that assigning a common weight to each group of 30 markers is a good strategy when using markers of the 50,000-marker (50K) chip. In a population with gradually increasing reference data, the weights can be updated once every 3 yr.
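
The idea of single-marker and group-marker weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula used to build the weighted G-matrix, so the code below assumes a commonly used form, G = Z D Z' / sum(2 p q d), with weights rescaled to a mean of 1. It also assumes that groups are formed from consecutive markers and that each group shares the mean of its markers' weights. The function names, the random genotypes, and the random weights are hypothetical and serve only as illustration.

```python
import numpy as np


def group_weights(w, group_size):
    """Give every marker the mean weight of its group of consecutive markers.

    Assumption: groups are non-overlapping runs of `group_size` markers in
    map order; the last group may be smaller.
    """
    w = np.asarray(w, dtype=float)
    out = np.empty_like(w)
    for start in range(0, w.size, group_size):
        out[start:start + group_size] = w[start:start + group_size].mean()
    return out


def weighted_g_matrix(M, w):
    """Weighted genomic relationship matrix (assumed form, see lead-in).

    M : (n_animals, n_markers) genotypes coded 0/1/2.
    w : nonnegative marker weights with a positive sum.
    """
    M = np.asarray(M, dtype=float)
    w = np.asarray(w, dtype=float)
    p = M.mean(axis=0) / 2.0          # observed allele frequencies
    Z = M - 2.0 * p                   # centered genotypes
    w = w * w.size / w.sum()          # rescale weights to a mean of 1
    denom = np.sum(2.0 * p * (1.0 - p) * w)
    return (Z * w) @ Z.T / denom      # Z D Z' / sum(2 p q d)


# Hypothetical toy data: 6 animals, 60 markers, made-up weights.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 60))
w_single = rng.random(60) + 0.01      # stand-in for, e.g., posterior SNP variances
w_group = group_weights(w_single, group_size=30)

G_unweighted = weighted_g_matrix(M, np.ones(60))
G_single = weighted_g_matrix(M, w_single)
G_group = weighted_g_matrix(M, w_group)
```

With all weights equal, the function reduces to the unweighted G-matrix under the same assumed scaling, so the weighted and unweighted GBLUP variants differ only in the matrix passed to the mixed-model solver.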

Keywords: genomic relationship matrix; genomic selection; model; reliability.

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Bayes Theorem
Body Weight
Breeding
Cattle
Fertility / genetics
Genetic Association Studies / veterinary
Genetic Loci*
Genome
Genomics / methods*
Genotype
Linear Models
Milk / metabolism
Models, Theoretical
Phenotype
Polymorphism, Single Nucleotide
Reproducibility of Results