Mol Biol Evol. 2017 Nov 1;34(11):2996-3005.
doi: 10.1093/molbev/msx209.

Detecting Long-Term Balancing Selection Using Allele Frequency Correlation

Free PMC article

Katherine M Siewert et al. Mol Biol Evol. 2017.

Abstract

Balancing selection occurs when multiple alleles are maintained in a population, which can result in their preservation over long evolutionary time periods. A characteristic signature of this long-term balancing selection is an excess of intermediate-frequency polymorphisms near the balanced variant. However, the expected distribution of allele frequencies at these loci has not been extensively detailed, and therefore existing summary statistic methods do not explicitly take it into account. Using simulations, we show that new mutations which arise in close proximity to a site targeted by balancing selection accumulate at frequencies nearly identical to that of the balanced allele. To scan the genome for balancing selection, we propose a new summary statistic, β, which detects these clusters of alleles at similar frequencies. Simulation studies show that, compared with existing summary statistics, our measure has improved power to detect balancing selection, and is reasonably powered in non-equilibrium demographic models and under a range of recombination and mutation rates. We compute β on 1000 Genomes Project data to identify loci potentially subjected to long-term balancing selection in humans. We report two balanced haplotypes, localized to the genes WFS1 and CADM2, that are strongly linked to association signals for complex traits. Our approach is computationally efficient and applicable to species that lack appropriate outgroup sequences, allowing for well-powered analysis of selection in the wide variety of species for which population data are rapidly being generated.

Keywords: balancing selection; human evolution; selection scans.

Figures

Fig. 1.
Model of allelic class build-up. (1) A new SNP (red star) arises in the population and is subject to balancing selection. (2) It sweeps up to its equilibrium frequency. (3) New SNPs enter the population linked to one of the two balanced alleles and some drift up in frequency. However, unlike in the neutral case, their maximum frequency is that of the balanced allele they are linked to, so variants build up at this frequency (e.g., blue diamond or brown circle). (4) Recombination decouples SNPs (e.g., purple pentagon) from the balanced site, allowing them to experience further genetic drift.
Fig. 2.
Simulations demonstrating build-up of alleles at frequencies similar to balanced alleles as compared with selectively neutral counterparts. The blue bars indicate the fraction of SNPs in simulation replicates at specific frequency differences away from a balanced core site. In contrast, the orange bars represent simulation replicates that lack a balanced variant. Instead, the core site is chosen to be a neutral variant with a frequency within 10% of the equilibrium frequency of variants introduced in the balanced simulations. (A) Folded frequency differences between the core SNP and each other SNP in a 400-bp window surrounding the core site. Recombination is not expected to have occurred in this region since the start of selection (Gao et al. 2015). (B) Frequency differences in 2,000-bp windows, where recombination is expected to have occurred since the start of selection.
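The quantity plotted in panel (A) can be illustrated with a minimal sketch. It assumes that "folded" means each allele frequency is first converted to its minor-allele frequency, and that the window is centered on the core site; the function names and the input layout are hypothetical and are not taken from the paper.

```python
def fold(freq):
    """Fold an allele frequency to the minor-allele frequency (at most 0.5)."""
    return min(freq, 1.0 - freq)


def folded_frequency_differences(core_pos, core_freq, snps, window_bp=400):
    """Return |folded(f_i) - folded(f_core)| for every other SNP in the window.

    snps is a list of (position, allele_frequency) pairs (hypothetical layout).
    The window is assumed to be centered on the core site, so SNPs within
    window_bp / 2 of core_pos are included.
    """
    half = window_bp / 2
    core_folded = fold(core_freq)
    return [
        abs(fold(freq) - core_folded)
        for pos, freq in snps
        if pos != core_pos and abs(pos - core_pos) <= half
    ]


# Toy data: a core SNP at position 100 with frequency 0.3.
example_snps = [(100, 0.3), (150, 0.7), (250, 0.1), (1000, 0.5)]
differences = folded_frequency_differences(100, 0.3, example_snps)
```

A histogram of such differences, pooled over many replicates, would correspond to one set of bars in the figure.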
Fig. 3.
Power of methods to detect ancient balancing selection. Power was calculated based on simulation replicates containing only neutral variants (True Negatives) or containing a balanced variant that was introduced (True Positives). Columns correspond to simulations of balanced alleles at equilibrium frequencies 0.25, 0.50, and 0.75. Rows correspond to older and more recent selection, beginning 250,000 and 100,000 generations prior to sampling, respectively.
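One common way to turn neutral and balanced replicates into a power estimate is sketched below. The caption does not state the false-positive rate used, so the `fpr` value, the function name, and the convention that higher scores indicate selection are all assumptions made for illustration.

```python
def power_at_fpr(neutral_scores, balanced_scores, fpr=0.01):
    """Estimate power at a fixed false-positive rate (assumed convention).

    The threshold is taken from the neutral replicates so that at most a
    fraction `fpr` of them score strictly above it. Power is the fraction
    of balanced replicates that score strictly above the same threshold.
    """
    ranked = sorted(neutral_scores, reverse=True)
    n_allowed = int(fpr * len(ranked))  # neutral replicates allowed above threshold
    threshold = ranked[n_allowed]
    hits = sum(1 for score in balanced_scores if score > threshold)
    return hits / len(balanced_scores)


# Toy scores, one value per simulation replicate.
neutral = [0, 1, 2, 3, 4, 5, 6, 7]
balanced = [10, 6, 5, 1]
power = power_at_fpr(neutral, balanced, fpr=0.25)
```

Repeating this for each statistic and each simulation setting (equilibrium frequency, age of selection) would give one value per curve or bar in a figure of this kind.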
Fig. 4.
Signal of balancing selection at CADM2. The signal of selection is located in an intron of CADM2. Top: rs17518584 is the lead GWAS SNP for several cognitive traits and is marked by the brown vertical dashed line. The purple dashed line marks two regulatory variants found on the balanced haplotype. β scores were calculated using a rolling average with windows of size 5 kb, including only SNPs at the same frequency as the core SNP in the average. In addition, we show the allele frequencies of the GWAS SNP and a top-scoring β SNP in each representative population. Bottom: Approximate haplotype spans for each population.
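The smoothing described in this caption could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the window is assumed to be centered on each core SNP, "same frequency" is taken to mean an identical allele count, and the tuple layout and function name are hypothetical.

```python
def rolling_mean_same_frequency(snps, window_bp=5000):
    """Average scores over SNPs in the window that share the core SNP's allele count.

    snps is a list of (position, allele_count, score) tuples (hypothetical layout).
    Returns a list of (position, smoothed_score) pairs, one per input SNP.
    The core SNP is always part of its own average, so the average is never empty.
    """
    half = window_bp / 2
    smoothed = []
    for core_pos, core_count, _ in snps:
        values = [
            score
            for pos, count, score in snps
            if abs(pos - core_pos) <= half and count == core_count
        ]
        smoothed.append((core_pos, sum(values) / len(values)))
    return smoothed


# Toy data: three SNPs share allele count 10, but only two lie within 5 kb of each other.
example = [(1000, 10, 2.0), (2000, 10, 4.0), (3000, 7, 9.0), (9000, 10, 6.0)]
result = rolling_mean_same_frequency(example)
```

The nested loop is quadratic in the number of SNPs, which is fine for a single locus; a genome-wide scan would need a sliding-window index instead.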
Fig. 5.
Signal of balancing selection at the WFS1 gene. Top: rs4458523 is the lead GWAS SNP for diabetes, and is marked by the brown vertical dashed line. The purple dashed line marks five regulatory variants found on the balanced haplotype. In addition, we show the allele frequencies of the GWAS SNP and a top-scoring β SNP in each representative population. Bottom: Approximate haplotype spans for each population.

References

    1. Agrawal AF, Hartfield M. 2016. Coalescence with background and balancing selection in systems with bi- and uniparental reproduction: contrasting partial asexuality and selfing. Genetics 202(1): 313–326.
    2. Aidoo M, Terlouw DJ, Kolczak MS, McElroy PD, ter Kuile FO, Kariuki S, Nahlen BL, Lal AA, Udhayakumar V. 2002. Protective effects of the sickle cell gene against malaria morbidity and mortality. The Lancet 359(9314): 1311–1312.
    3. Andrés AM, Hubisz MJ, Indap A, Torgerson DG, Degenhardt JD, Boyko AR, Gutenkunst RN, White TJ, Green ED, Bustamante CD, et al. 2009. Targets of balancing selection in the human genome. Mol Biol Evol. 26: 2755–2764.
    4. Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, Karczewski KJ, Park J, Hitz BC, Weng S, et al. 2012. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 22(9): 1790–1797.
    5. Bubb KL, Bovee D, Buckley D, Haugen E, Kibukawa M, Paddock M, Palmieri A, Subramanian S, Zhou Y, Kaul R, et al. 2006. Scan of human genome reveals no new loci under ancient balancing selection. Genetics 173(4): 2165–2177.
