Accuracy of Prediction of Simulated Polygenic Phenotypes and Their Underlying Quantitative Trait Loci Genotypes Using Real or Imputed Whole-Genome Markers in Cattle

Genet Sel Evol. 2015 Dec 23;47:99. doi: 10.1186/s12711-015-0179-4.


Background: More accurate genomic predictions are expected when the effects of QTL (quantitative trait loci) are predicted from markers in close physical proximity to the QTL. The objective of this study was to quantify to what extent whole-genome methods using 50 K or imputed 770 K SNPs (single nucleotide polymorphisms) could predict single or multiple QTL genotypes based on SNPs in close proximity to those QTL.

Methods: Phenotypes with a heritability of 1 were simulated for 2677 Hereford animals genotyped with the BovineSNP50 BeadChip. Genotypes for the high-density 770 K SNP panel were imputed using Beagle software. Various Bayesian regression methods were used to predict single QTL or a trait influenced by 42 such QTL. We quantified to what extent these predictions were based on SNPs in close proximity to the QTL by comparing whole-genome predictions to local predictions based on estimates of the effects of variable numbers of SNPs i.e. ±1, ±2, ±5, ±10, ±50 or ±100 that flanked the QTL.

Results: Prediction accuracies based on local SNPs using whole-genome training for single QTL with the 50 K SNP panel and BayesC0 ranged from 0.49 (±1 SNP) to 0.75 (±100 SNPs). The minimum number of local SNPs for an accurate prediction is ±10 SNPs. Prediction accuracies that were based on local SNPs only were higher than those based on whole-genome SNPs for both 50 K and 770 K SNP panels. For the 770 K SNP panel, prediction accuracies were higher than 0.70 and varied little i.e. between 0.73 (±1 SNP) and 0.77 (±5 SNPs). For the summed 42 QTL, prediction accuracies were generally higher than for single QTL regardless of the number of local SNPs. For QTL with low minor allele frequency (MAF) compared to QTL with high MAF, prediction accuracies increased as the number of SNPs around the QTL increased.

Conclusions: These results suggest that with both 50 K and imputed 770 K SNP genotypes the level of linkage disequilibrium is sufficient to predict single and multiple QTL. However, prediction accuracies are eroded through spuriously estimated effects of SNPs that are distant from the QTL. Prediction accuracies were higher with the 770 K than with the 50 K SNP panel.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Bayes Theorem
  • Breeding
  • Cattle
  • Gene Frequency
  • Genetic Markers
  • Genome*
  • Genomics / methods*
  • Genotype*
  • Markov Chains
  • Models, Genetic*
  • Models, Statistical
  • Multifactorial Inheritance*
  • Phenotype*
  • Polymorphism, Single Nucleotide
  • Quantitative Trait Loci*
  • Quantitative Trait, Heritable
  • Reproducibility of Results
  • Selection, Genetic


  • Genetic Markers