A Bayesian outlier criterion to detect SNPs under selection in large data sets

PLoS One. 2010 Aug 2;5(8):e11913. doi: 10.1371/journal.pone.0011913.

Abstract

Background: The recent advent of high-throughput SNP genotyping technologies has opened new avenues of research for population genetics. In particular, a growing interest in the identification of footprints of selection, based on genome scans for adaptive differentiation, has emerged.

Methodology/principal findings: The purpose of this study is to develop an efficient model-based approach for Bayesian exploratory analyses of adaptive differentiation in very large SNP data sets. The basic idea is to start with a very simple model for neutral loci that is easy to implement in a Bayesian framework and to identify selected loci as outliers via Posterior Predictive P-values (PPP-values). Applications of this strategy are considered using two different statistical models. The first was initially interpreted in the context of populations each evolving under pure genetic drift from a common ancestral population, while the second relies on populations at migration-drift equilibrium. The robustness and power of the two resulting Bayesian model-based approaches to detect SNPs under selection are further evaluated through extensive simulations. An application to a cattle data set is also provided.
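To make the outlier criterion concrete, the following is a minimal sketch of how a PPP-value could be computed for a single SNP. It is an illustration under stated assumptions, not the procedure implemented in the study: the Beta model for population allele frequencies, the discrepancy measure, the function names, and the toy "posterior draws" (fixed values standing in for MCMC output) are all hypothetical choices made here.

```python
import random


def ppp_value(obs_discrepancies, rep_discrepancies):
    """Share of posterior draws whose replicated discrepancy is >= the observed one."""
    if not obs_discrepancies or len(obs_discrepancies) != len(rep_discrepancies):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(r >= o for o, r in zip(obs_discrepancies, rep_discrepancies))
    return hits / len(obs_discrepancies)


def discrepancy(counts, sizes, pi):
    """Hypothetical discrepancy: scaled squared deviation of sample frequencies from pi."""
    return sum((y / n - pi) ** 2 for y, n in zip(counts, sizes)) / (pi * (1.0 - pi))


def binomial_draw(rng, n, p):
    """Binomial(n, p) draw as a sum of Bernoulli trials (standard library only)."""
    return sum(rng.random() < p for _ in range(n))


def snp_ppp(counts, sizes, posterior_draws, rng):
    """PPP-value for one SNP under an assumed neutral Beta-binomial model.

    counts[j], sizes[j]: observed allele count and sample size in population j.
    posterior_draws: (pi, c) pairs, with pi an ancestral allele frequency and c a
    differentiation parameter, both strictly between 0 and 1. In a real analysis
    these would come from an MCMC run; here they are supplied by the caller.
    """
    obs, rep = [], []
    for pi, c in posterior_draws:
        a = pi * (1.0 - c) / c
        b = (1.0 - pi) * (1.0 - c) / c
        # Simulate replicate counts for every population from the neutral model.
        rep_counts = [binomial_draw(rng, n, rng.betavariate(a, b)) for n in sizes]
        obs.append(discrepancy(counts, sizes, pi))
        rep.append(discrepancy(rep_counts, sizes, pi))
    return ppp_value(obs, rep)


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [(0.5, 0.05)] * 1000        # toy stand-in for posterior samples
    sizes = [100, 100, 100, 100]
    print("moderately differentiated SNP:", snp_ppp([38, 62, 55, 45], sizes, draws, rng))
    print("strongly differentiated SNP:  ", snp_ppp([5, 95, 3, 97], sizes, draws, rng))
```

With this discrepancy, a PPP-value close to 0 indicates that the SNP is more differentiated across populations than the neutral model predicts, while a value close to 1 indicates less differentiation than expected. Which discrepancy measure and decision thresholds are appropriate depends on the model actually fitted.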

Conclusions/significance: The procedure described turns out to be much faster than previous Bayesian approaches and reasonably efficient, especially for detecting loci under positive selection.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Physiological
  • Animals
  • Bayes Theorem
  • Cattle
  • Databases, Genetic*
  • Gene Deletion
  • Genetic Loci / genetics
  • Genomics / methods*
  • Genotype
  • Likelihood Functions
  • Models, Genetic
  • Polymorphism, Single Nucleotide*
  • Reproducibility of Results
  • Selection, Genetic*