Reducing dimensionality for prediction of genome-wide breeding values

Genet Sel Evol. 2009 Mar 18;41(1):29. doi: 10.1186/1297-9686-41-29.


Partial least squares regression (PLSR) and principal component regression (PCR) are methods designed for situations where the number of predictors is larger than the number of records. The aim was to compare the accuracy of genome-wide estimated breeding values (EBV) produced using PLSR and PCR with that of a Bayesian method, 'BayesB'. Marker densities of 1, 2, 4 and 8 Ne markers/Morgan were evaluated when the effective population size (Ne) was 100. The correlation between true breeding value and estimated breeding value increased with density from 0.611 to 0.681 using PLSR and from 0.604 to 0.658 using PCR, with an overall advantage to PLSR of 0.016 (s.e. = 0.008). Both methods gave lower accuracy than 'BayesB', for which accuracy increased from 0.690 to 0.860. PLSR and PCR appeared less responsive to increased marker density, with the advantage of 'BayesB' increasing by 17% from a marker density of 1 to 8 Ne/M. PCR and PLSR showed greater bias than 'BayesB' in predicting breeding values at all densities. Although PLSR and PCR were computationally faster and simpler, these advantages do not outweigh the reduction in accuracy, and there is a benefit in obtaining relevant prior information from the distribution of gene effects.
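
As an illustration of the two dimension-reduction methods compared above, the following is a minimal NumPy sketch of PCR and single-trait PLSR (PLS1, NIPALS form) applied to a case with more markers than records. It is not the simulation or software used in the study: the toy genotype simulator, the function names, the number of components `k`, the number of QTL and the noise level are all assumptions made for the example. Accuracy is taken as the correlation between true and estimated breeding values, as in the abstract; bias is summarised here as the slope of the regression of true on estimated breeding value, which is a common convention and an assumption about how it might be measured.

```python
import numpy as np

def simulate_genotypes(rng, n, p, switch=0.1):
    """Toy 0/1/2 genotypes with LD between neighbouring markers (assumption):
    each haplotype allele copies the previous locus with prob. 1 - switch."""
    hap = np.empty((2 * n, p), dtype=int)
    hap[:, 0] = rng.integers(0, 2, size=2 * n)
    for j in range(1, p):
        fresh = rng.random(2 * n) < switch
        hap[:, j] = np.where(fresh, rng.integers(0, 2, size=2 * n), hap[:, j - 1])
    return (hap[:n] + hap[n:]).astype(float)

def pcr_fit(X, y, k):
    """PCR: regress centred y on the first k principal components of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    # Map component coefficients back to marker effects: V_k diag(1/s_k) U_k' y
    beta = Vt[:k].T @ ((U[:, :k].T @ (y - y_mean)) / s[:k])
    return beta, x_mean, y_mean

def pls1_fit(X, y, k):
    """PLS1 via NIPALS: components chosen to covary with y, then deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    p = X.shape[1]
    W, P, q = np.zeros((p, k)), np.zeros((p, k)), np.zeros(k)
    for a in range(k):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # weight vector
        t = Xk @ w                       # score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt          # X loadings
        q[a] = yk @ t / tt               # y loading
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])   # deflate X
        yk = yk - q[a] * t               # deflate y
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def accuracy(tbv, ebv):
    """Correlation between true and estimated breeding values."""
    return float(np.corrcoef(tbv, ebv)[0, 1])

def bias_slope(tbv, ebv):
    """Slope of the regression of true on estimated breeding value (1 = unbiased)."""
    return float(np.cov(tbv, ebv)[0, 1] / np.var(ebv, ddof=1))

def run_demo(seed=0, n_train=200, n_val=200, p=500, n_qtl=20, k=10):
    """Fit both methods on training records, evaluate on validation records."""
    rng = np.random.default_rng(seed)
    G = simulate_genotypes(rng, n_train + n_val, p)
    effects = np.zeros(p)
    effects[rng.choice(p, size=n_qtl, replace=False)] = rng.normal(size=n_qtl)
    tbv = G @ effects                                   # true breeding values
    y = tbv + rng.normal(scale=tbv.std(), size=tbv.size)  # heritability near 0.5
    tr, va = slice(0, n_train), slice(n_train, None)
    results = {}
    for name, fit in (("PCR", pcr_fit), ("PLSR", pls1_fit)):
        beta, x_mean, y_mean = fit(G[tr], y[tr], k)
        ebv = (G[va] - x_mean) @ beta + y_mean
        results[name] = (accuracy(tbv[va], ebv), bias_slope(tbv[va], ebv))
    return results

if __name__ == "__main__":
    for name, (acc, slope) in run_demo().items():
        print(f"{name}: accuracy={acc:.3f}, regression of TBV on EBV={slope:.3f}")
```

The design difference is visible in the two fitting functions: PCR picks its components from the marker matrix alone, whereas PLSR picks each component using its covariance with the phenotype. Neither uses prior information on the distribution of gene effects, which is what 'BayesB' adds. In practice `k` would be tuned, for example by cross-validation, rather than fixed as it is here.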

MeSH terms

  • Animals
  • Animals, Domestic / genetics*
  • Breeding*
  • Chromosomes, Mammalian / genetics
  • Computer Simulation
  • Female
  • Genome*
  • Male
  • Models, Genetic