Genetics. 2022 Jul 4;221(3):iyac052.
doi: 10.1093/genetics/iyac052.

Recombination, selection, and the evolution of tandem gene arrays

Moritz Otto et al. Genetics.

Abstract

Multigene families (immunity genes or sensory receptors, for instance) are often subject to diversifying selection. Allelic diversity may be favored not only through balancing or frequency-dependent selection at individual loci but also by associating different alleles in multicopy gene families. Using a combination of analytical calculations and simulations, we explored a population genetic model of epistatic selection and unequal recombination, where a trade-off exists between the benefit of allelic diversity and the cost of copy abundance. Starting from the neutral case, where we showed that gene copy number is Gamma distributed at equilibrium, we also derived the mean and shape of the limiting distribution under selection. Considering a more general model, which includes variable population size and population substructure, we used simulations to explore mean fitness and some summary statistics of the copy number distribution. We determined the relative effects of selection, recombination, and demographic parameters in maintaining allelic diversity and shaping the mean fitness of a population. One way to control the variance of copy number is by lowering the rate of unequal recombination. Indeed, when encoding recombination by a rate modifier locus, we observed exactly this predicted outcome. Finally, we analyzed the empirical copy number distribution of 3 genes in humans and estimated recombination and selection parameters of our model.

Keywords: balancing selection; epistasis; gene families; immune genes; unequal recombination.
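
As a rough illustration of the neutral-case statement in the abstract, the following Python sketch draws copy numbers from a Gamma distribution and recovers its shape parameter by the method of moments, α = (mean/σ)², the same ratio used in Fig. 2b. The shape α = 4 and the mean 14.94 are taken from the captions of Fig. 2 and Fig. 3. The function name, sample size, and seed are illustrative assumptions, and treating copy number as a continuous quantity is a simplification, not the authors' procedure.

```python
import random
import statistics


def gamma_shape_from_moments(samples):
    """Method-of-moments estimate of the Gamma shape: (mean / sd) ** 2."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    return (mean / sd) ** 2


# Assumed illustration values: shape 4 (neutral expectation, Fig. 2b)
# and mean copy number 14.94 (PSG3, Fig. 3).
NEUTRAL_SHAPE = 4.0
MEAN_COPY_NUMBER = 14.94

rng = random.Random(1)  # fixed seed, arbitrary choice
scale = MEAN_COPY_NUMBER / NEUTRAL_SHAPE  # Gamma mean = shape * scale
copy_numbers = [rng.gammavariate(NEUTRAL_SHAPE, scale) for _ in range(100_000)]

alpha_hat = gamma_shape_from_moments(copy_numbers)
print(f"estimated shape: {alpha_hat:.2f}")
```

With a large sample the estimate should land close to the shape used for the draw. Applying the same function to observed copy numbers would give only a crude moment-based estimate, not the fitted parameters reported in the paper.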

Figures

Fig. 1.
a) Fitness of an individual as a function of x (stacks) and y (bars). Parameters: sx=0.02, sy=0.005, βx=0.95, βy=1.05. Each bar represents one value of y with stacked fitness “layers” for x = 1 to x = y. b) Normalized fitness of an individual in the y-only model. Parameters: sx=0.02, sy=0.005, ε=0.05 (black) and its Taylor-approximated version T(y)=1−s̃(y−y*)², with s̃≈0.00047 (red). The vertical line marks y*≈14.86. c) Illustration of individual genotype unequal recombination. Recombination occurs in an individual with y=7=4+3 gene copies and x=5<4+3 different alleles (colors). The black bullet on each chromosome represents the RRM locus (see text).
Fig. 2.
a) Linear fit of σ/EY on ln(r/sx) (for details see text). Note the strong correlation of ln(r/sx) and σ/EY, with a Pearson correlation coefficient of ρ=0.97. The estimated regression line σ/y*=0.046·ln(r/sx)+0.26 is shown in red. b) Convergence of the Gamma shape parameter α=(EY/σ)² toward the value α = 4, expected under neutrality, when r is increasing or sx is decreasing.
Fig. 3.
Copy number distribution of 3 different human genes and their approximations. Black: Copy number distribution under neutrality p̃stat with EY=14.94, 11.85, and 19.85 for PSG3, MUC12, and PRR20A, respectively. Red: Gamma distribution with parameters given in Table 2, resulting in best KS-test P-value. Blue: Equilibrium distribution of the y-only model generated from equation (4) with parameters as in Table 2.
Fig. 4.
Scenario (a1)—constant population size. Population statistics at equilibrium: population mean x¯ (a); population mean y¯ (b); x/y¯ ratio (c); population mean fitness (d); total number (e), and effective number of alleles |x|eff (f). Varying parameters: population size Ne and selection coefficient sx. Mutation (μ=0.0005) and recombination rate (r=0.01) are kept fixed. Boxplots based on 500 independent replicates. Box colored in purple indicates a parameter combination (Ne = 2,000, r=0.01, sx=0.02, sy=0.005) shared by scenarios (a), (b), (c), and (d). Horizontal lines in a–c indicate the optimal copy number in the y-only model. Horizontal lines in d indicate optimal fitness.
Fig. 5.
Scenario (a2)—constant population size. Population statistics at equilibrium: population mean x¯ (a); population mean y¯ (b); x/y¯ ratio (c); population mean fitness (d); total (e), and effective number |x|eff (f) of alleles. Varying parameters: population size Ne = 1,000, 2,000 and recombination rate (r=0.01 times the factor indicated on the abscissa). Mutation rate (μ=0.0005) and selection strength ((sx,sy)=(0.02,0.005)) are kept fixed. Boxplots based on 500 independent replicates. Box colored in purple indicates the parameter combination (see Fig. 4) shared by scenarios (a), (b), (c), and (d). Horizontal lines as explained in Fig. 4.
Fig. 6.
Scenario (b)—recovery after a bottleneck. Equilibrium populations with N = 2,000 are reduced to Nred = 20 for a period of 5, 10, or 20 generations and then restored. During recovery, 6 statistics are traced: (a) population mean x¯; (b) population mean y¯; (c) ratio x/y¯; (d) mean fitness ω¯; (e) total number of alleles; and (f) |x|eff. Red, orange, and yellow indicate strong, intermediate, and weak selection. Solid, dashed, and dotted lines indicate bottleneck durations of 5, 10, and 20 generations. Each curve is an average across 200 replicates. Horizontal black lines are equilibria under constant population size.
Fig. 7.
Scenario (c)—migration. Two separated and equilibrated subpopulations of size N =1,000 start to exchange migrants at time t =0. Medium strength of selection (sx=0.02,sy=0.005). Migration rate: 2Nm=0.1 (green), 1 (cyan), or 10 (blue) migrants per generation in each direction. (a) population mean x¯; (b) population mean y¯; (c) ratio x/y¯; (d) population mean fitness ω¯; (e) total, and (f) effective number of alleles in the combined super-population. Shown are mean values across 100 replicates. Black lines indicate mean values (across 500 replicates) in panmictic populations of size Ne = 1,000 (lower line) and Ne = 2,000 (upper line).
Fig. 8.
Scenario (d)—RRM: recombination rate modification. Populations, which have reached equilibrium without RRM, are carried on for 50,000 generations during which the recombination rate, encoded at a modifier locus, may change under the influence of selection. For all iterations: Ne = 2,000, r =0.01. Left: weak (sx,sy)=(0.01,0.0025); middle: intermediate (0.02,0.005); right: strong selection (0.04,0.01). Shown are trajectories of the recombination rate (in percentage of its original value r =0.01) for 200 replicates each. The mean across all 200 replicates is shown as a black line.
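
The two closed-form expressions quoted in the captions can be evaluated directly. The Python sketch below assumes the Fig. 1b approximation reads T(y) = 1 − s̃(y − y*)² with s̃ ≈ 0.00047 and y* ≈ 14.86, and takes the regression line σ/y* = 0.046·ln(r/sx) + 0.26 from Fig. 2a. The function names are hypothetical, and the constants are the rounded values printed in the captions, not refitted estimates.

```python
import math

S_TILDE = 0.00047  # curvature of the Taylor approximation (Fig. 1b, rounded)
Y_OPT = 14.86      # optimal copy number y* (Fig. 1b, rounded)


def taylor_fitness(y):
    """Quadratic approximation of normalized fitness around y*."""
    return 1.0 - S_TILDE * (y - Y_OPT) ** 2


def predicted_relative_sd(r, sx):
    """Regression line of Fig. 2a: sigma / y* as a function of ln(r / sx)."""
    return 0.046 * math.log(r / sx) + 0.26


# Parameter values quoted in the captions: r = 0.01, sx = 0.02.
for y in (10, 15, 20):
    print(f"T({y}) = {taylor_fitness(y):.4f}")
print(f"sigma/y* at r=0.01, sx=0.02: {predicted_relative_sd(0.01, 0.02):.3f}")
```

The quadratic form is a local approximation around y*, and the regression line summarizes a fit to simulated values, so both are best read as interpolations near the parameter values the captions report.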
