Background: Genomic prediction of breeding values involves a so-called training analysis that predicts the influence of small genomic regions by regression of observed information on marker genotypes for a given population of individuals. Available observations may take the form of individual phenotypes, repeated observations, records on close family members such as progeny, or estimated breeding values (EBV) or their deregressed counterparts from genetic evaluations. The literature indicates that researchers are inconsistent in how they use EBV or deregressed data, and in how they weight data sources to account for heterogeneous variance.
Methods: A logical approach to using information for genomic prediction is introduced. It demonstrates the appropriate weights for analyzing observations with heterogeneous variance, and explains why and how EBV should have parent average effects removed, be deregressed and be weighted.
Results: An appropriate deregression for genomic regression analyses is EBV/r², where EBV excludes parent information and r² is the reliability of that EBV. The appropriate weights for deregressed breeding values are neither the reliability nor the prediction error variance, two alternatives that have been used in published studies, but the ratio (1 − h²)/[(c + (1 − r²)/r²)h²], where c > 0 is the fraction of genetic variance not explained by markers.
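As an illustration only, the deregression and the weight stated in the Results can be evaluated numerically. The sketch below is not the authors' software. The function names `deregress` and `deregressed_weight` are hypothetical, `r2` and `h2` stand for the reliability and the heritability, and the input values are made up for the example. It assumes the EBV passed in has already had parent average effects removed.

```python
def _check_unit_interval(name, value, closed_upper):
    """Raise ValueError unless value lies in (0, 1] or (0, 1)."""
    upper_ok = value <= 1.0 if closed_upper else value < 1.0
    if not (value > 0.0 and upper_ok):
        raise ValueError(f"{name} is out of range: {value}")


def deregress(ebv, r2):
    """Deregressed EBV, computed as EBV / r2.

    Assumption: `ebv` already excludes parent average information and
    `r2` is the reliability of that parent-free EBV.
    """
    _check_unit_interval("r2", r2, closed_upper=True)
    return ebv / r2


def deregressed_weight(r2, h2, c):
    """Weight (1 - h2) / [(c + (1 - r2) / r2) * h2] for a deregressed EBV.

    r2: reliability of the EBV, in (0, 1]
    h2: heritability, in (0, 1)
    c:  fraction of genetic variance not explained by markers, c > 0
    """
    _check_unit_interval("r2", r2, closed_upper=True)
    _check_unit_interval("h2", h2, closed_upper=False)
    if c <= 0.0:
        raise ValueError(f"c must be positive: {c}")
    return (1.0 - h2) / ((c + (1.0 - r2) / r2) * h2)


if __name__ == "__main__":
    # Made-up values for illustration: h2 = 0.25, c = 0.5.
    for r2 in (0.5, 0.8):
        w = deregressed_weight(r2, h2=0.25, c=0.5)
        print(f"r2={r2}: deregressed EBV={deregress(1.0, r2):.3f}, weight={w:.3f}")
```

With these illustrative inputs the weight rises from about 2 at r2 = 0.5 to about 4 at r2 = 0.8, so more reliable EBV receive more weight, while the term c keeps the weight bounded as r2 approaches 1.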
Conclusions: Phenotypic information on some individuals and deregressed data on others can be combined in genomic analyses using appropriate weighting.