Proc Natl Acad Sci U S A, 100 (1), 171-6

Natural Selection Shaped Regional mtDNA Variation in Humans



Dan Mishmar et al. Proc Natl Acad Sci U S A.


Human mtDNA shows striking regional variation, traditionally attributed to genetic drift. However, it is not easy to account for the fact that only two mtDNA lineages (M and N) left Africa to colonize Eurasia and that lineages A, C, D, and G show a 5-fold enrichment from central Asia to Siberia. As an alternative to drift, natural selection might have enriched for certain mtDNA lineages as people migrated north into colder climates. To test this hypothesis we analyzed 104 complete mtDNA sequences from all global regions and lineages. African mtDNA variation did not significantly deviate from the standard neutral model, but European, Asian, and Siberian plus Native American variations did. Analysis of amino acid substitution mutations (nonsynonymous, Ka) versus neutral mutations (synonymous, Ks) (ka/ks) for all 13 mtDNA protein-coding genes revealed that the ATP6 gene had the highest amino acid sequence variation of any human mtDNA gene, even though ATP6 is one of the more conserved mtDNA proteins. Comparison of the ka/ks ratios for each mtDNA gene from the tropical, temperate, and arctic zones revealed that ATP6 was highly variable in the mtDNAs from the arctic zone, cytochrome b was particularly variable in the temperate zone, and cytochrome oxidase I was notably more variable in the tropics. Moreover, multiple amino acid changes found in ATP6, cytochrome b, and cytochrome oxidase I appeared to be functionally significant. From these analyses we conclude that selection may have played a role in shaping human regional mtDNA variation and that one of the selective influences was climate.


Figure 1
Consensus neighbor-joining tree of 104 human mtDNA complete sequences. Numbers correspond to bootstrap values (percentage of 500 total bootstrap replicates). Because this is a consensus tree, based on bootstrapping, the branch length is not proportional to the mutation numbers. Diagonal lines are drawn in the chimp lineage to denote the much greater genetic distance between human and chimp than among the various human mtDNAs. Maximum likelihood and unweighted pair group method with arithmetic mean methods yielded the same branching orders with respect to the geographically delimited mtDNA haplogroups. Sequences are: I1–53, GenBank accession nos. AF346963–AF347015, numbered according to figure 2 in Ingman et al. (8), starting from the top of that figure; e21u, GenBank accession no. X93334; a1l1a, GenBank accession no. D38112; cam revise, GenBank accession no. NC_001807 corrected according to ref. ; the rest are 48 sequences generated by us by using Applied Biosystems 377. Colors correspond to the continental origin of the individuals chosen for this analysis: yellow, Africans; purple, Europeans; pink, Asians and Native Americans. Specific mutations in patient samples that have been implicated in disease were excluded from this analysis, as were gaps and deletions, with the exception of the 9-bp deletion (nucleotide positions 8272–8280). Haplogroup names are designated with capital letters. Pan paniscus and Pan troglodytes mtDNA sequences were used as outgroups. Haplogroups L0 and L1 replace the previously defined haplogroups L1a and L1b, respectively (35).
Figure 2
Distribution of the relative selective constraints [ka/(ks + constant)] of the 13 human mtDNA polypeptide genes calculated from the 104 complete human mtDNA sequences (16). For each gene, the bottom and top of the line indicate the minimum and maximum values, respectively. The bottom, intermediate, and top horizontal lines in the boxes represent the 25th, 50th (median), and 75th percentile values, respectively. The dot indicates the mean. We have also calculated ka/(ks + ka), dropping those values that were 0/0. This calculation gave essentially the same results (Fig. 5, which is published as supporting information on the PNAS web site).
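As a rough illustration of the quantities plotted in Fig. 2, the sketch below computes ka/(ks + constant) for a handful of pairwise comparisons and then the box-plot statistics named in the caption (minimum, 25th, 50th, and 75th percentiles, maximum, and mean). The value of the constant is not given here, so 0.01 is an assumption chosen only to keep the ratio defined when ks is 0; the (ka, ks) pairs are made-up placeholders rather than data from the study, and the function names are hypothetical.

```python
from statistics import mean, quantiles


def relative_constraint(ka, ks, constant=0.01):
    """Return ka / (ks + constant).

    The added constant keeps the ratio defined when ks == 0.
    The default of 0.01 is an assumption, not a value from the paper.
    """
    return ka / (ks + constant)


def box_summary(values):
    """Summarize values the way the Fig. 2 caption describes each box."""
    # method="inclusive" treats the sample minimum and maximum as the
    # 0th and 100th percentiles; other quantile conventions exist.
    q25, median, q75 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q25": q25,
        "median": median,
        "q75": q75,
        "max": max(values),
        "mean": mean(values),
    }


# Placeholder (ka, ks) pairs for one gene; NOT data from the study.
pairs = [
    (0.000, 0.000),
    (0.001, 0.004),
    (0.002, 0.000),
    (0.001, 0.010),
    (0.003, 0.005),
]

ratios = [relative_constraint(ka, ks) for ka, ks in pairs]
summary = box_summary(ratios)
print(summary)
```

Note that the first pair, which would be 0/0 under a plain ka/ks ratio, simply yields 0 here; the alternative ka/(ks + ka) calculation mentioned in the caption instead drops such pairs.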
Figure 3
Distribution of the relative selective constraints [ka/(ks + constant)] calculated for the human mtDNA lineages associated with different climatic zones: tropical and subtropical (African), temperate (European), and arctic and subarctic (Siberian and Native American). Calculation of ka/(ks + constant) and distribution of values are as presented in Fig. 2. Numbers above plots represent P values (Wilcoxon rank-sum test) for the comparison of the distribution of ka/(ks + constant) values for tropical (L0–L3) to temperate (H, V, U, J, T, I, X, N1b, and W) or arctic (A, C, D, G, X, Y, and Z) zones. Very similar distributions and P values were obtained for the arctic whether or not haplogroup B mtDNAs were included in the calculation. Similar results have been obtained by calculating ka/(ks + ka) where significant differences (P ≤ 0.01) were found between tropical (Africans) and arctic (Siberians and Native Americans) for the ND1, ND3, ND5, ND6, COI, COIII, ATP6, and ATP8 genes and between tropical and temperate (Europeans) for the ND1, ND2, ND5, ND6, cytb, COI, COII, and ATP8 genes (Table 2, which is published as supporting information on the PNAS web site). To control for the possibility that the observed differences in the distribution of ka/(ks + constant) ratios were simply an artifact of pairwise calculations, we also compared the raw number of nsyn and syn mutations for each lineage. Using ATP6 as an example, the nsyn/syn ratio for the tropics was 3/15 (0.20), temperate 5/6 (0.83), and arctic 7/5 (1.4). By two-tailed Fisher's exact test, the tropical to arctic ratios were significantly different (P ≤ 0.05). To determine the importance of the distribution of nsyn and syn mutations along individual mtDNA lineages on ka/(ks + constant), we used PAML (36) to chart the locations of nsyn and syn variants for ATP6 in the arctic A–D and X and the tropical L0–L3 haplogroups.
This process revealed that nsyn and syn mutations were relatively uniformly distributed across the A–D and X lineages, whereas the few African ATP6 variants were located near the ends of the L0 and L3 branches.
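The ATP6 count comparison in the Fig. 3 caption can be reproduced approximately with a small pure-Python sketch of a two-tailed Fisher's exact test. It uses the nsyn and syn counts quoted in the caption (tropical 3 and 15, arctic 7 and 5) and one common two-sided convention: summing the probabilities of all tables that are no more likely than the observed one. Which convention or software the authors used is not stated here, so treat this as an assumption; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table. This is one
    common two-sided definition, assumed here for illustration.
    """
    row1 = a + b          # total mutations in the first group
    col1 = a + c          # total nonsynonymous mutations
    n = a + b + c + d     # grand total

    def weight(k):
        # Number of ways to get k nonsynonymous mutations in the first group.
        return comb(col1, k) * comb(n - col1, row1 - k)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer weights make the "no more probable than observed" test exact.
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(n, row1)


# ATP6 counts quoted in the caption: (nsyn, syn) per climatic zone.
tropical = (3, 15)
arctic = (7, 5)

ratio_tropical = tropical[0] / tropical[1]   # 0.20
ratio_arctic = arctic[0] / arctic[1]         # 1.4
p_value = fisher_exact_two_tailed(*tropical, *arctic)
print(f"tropical vs arctic: P = {p_value:.3f}")
```

Under this convention the tropical versus arctic comparison gives P of roughly 0.045, in line with the P ≤ 0.05 reported in the caption.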
