Genomic prediction of dichotomous traits with Bayesian logistic models

Theor Appl Genet. 2013 Apr;126(4):1133-43. doi: 10.1007/s00122-013-2041-9. Epub 2013 Feb 6.

Abstract

Bayesian methods are a popular choice for genomic prediction of genotypic values. The methodology is well established for traits with approximately Gaussian phenotypic distribution. However, numerous important traits are of dichotomous nature and the phenotypic counts observed follow a Binomial distribution. The standard Gaussian generalized linear models (GLM) are not statistically valid for this type of data. Therefore, we implemented Binomial GLM with logit link function for the BayesB and Bayesian GBLUP genomic prediction methods. We compared these models with their standard Gaussian counterparts using two experimental data sets from plant breeding, one on female fertility in wheat and one on haploid induction in maize, as well as a simulated data set. With the aid of the simulated data referring to a bi-parental population of doubled haploid lines, we further investigated the influence of training set size (N), number of independent Bernoulli trials for trait evaluation (n i ) and genetic architecture of the trait on genomic prediction accuracies and abilities in general and on the relative performance of our models. For BayesB, we in addition implemented finite mixture Binomial GLM to account for overdispersion. We found that prediction accuracies increased with increasing N and n i . For the simulated and experimental data sets, we found Binomial GLM to be superior to Gaussian models for small n i , but that for large n i Gaussian models might be used as ad hoc approximations. We further show with simulated and real data sets that accounting for overdispersion in Binomial data can markedly increase the prediction accuracy.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Binomial Distribution
  • Breeding / methods*
  • Computer Simulation
  • Fertility / genetics
  • Genetic Markers / genetics
  • Genome, Plant / genetics*
  • Genotype
  • Linear Models
  • Models, Genetic
  • Phenotype*
  • Triticum / genetics*
  • Triticum / physiology

Substances

  • Genetic Markers