Cell, 171 (1), 59-71.e21

Reconstructing Prehistoric African Population Structure


Pontus Skoglund et al. Cell.


We assembled genome-wide data from 16 prehistoric Africans. We show that the anciently divergent lineage that comprises the primary ancestry of the southern African San had a wider distribution in the past, contributing approximately two-thirds of the ancestry of Malawi hunter-gatherers ∼8,100-2,500 years ago and approximately one-third of the ancestry of Tanzanian hunter-gatherers ∼1,400 years ago. We document how the spread of farmers from western Africa involved complete replacement of local hunter-gatherers in some regions, and we track the spread of herders by showing that the population of a ∼3,100-year-old pastoralist from Tanzania contributed ancestry to people from northeastern to southern Africa, including a ∼1,200-year-old southern African pastoralist. The deepest diversifications of African lineages were complex, involving either repeated gene flow among geographically disparate groups or a lineage more deeply diverging than that of the San contributing more to some western African populations than to others. We finally leverage ancient genomes to document episodes of natural selection in southern African populations.

Keywords: Africa; adaptation; ancient DNA; hunter-gatherers; natural selection; population genetics; population history.


Figure 1. Overview of ancient genomes and African population structure
a) Map of sampling locations in Africa and principal component analysis of all individuals. Present-day individuals are indicated with gray circles. b) Automated clustering of key ancient- and present-day populations (for K = 7 cluster components). Present-day populations are labeled in gray.
Figure 2. Ancestral components in eastern and southern Africa
We show bar plots with the proportions inferred for the best model for each target population. We used a model that inferred the ancestry of each target population as a 1-source, 2-source, or 3-source mixture of a set of potential source populations. In (a) we show an example of the inferred model for South_Africa_1200BP, an early pastoralist. A filled circle symbol in each panel indicates the geographic location of the sample that we use as a representative of the source population. We show five sources: b) South_Africa_2000BP, representing forager populations in southern Africa and a component of prehistoric Malawi and Tanzania that is no longer extant; c) Ethiopia_4500BP, which is today found in the Hadza but in the past was characteristic of eastern African hunter-gatherers; d) the Mende from Sierra Leone, which is related deeply to the western African ancestry that was spread with the Bantu expansion of agriculturalists; e) the Savanna Pastoral Neolithic sample Tanzania_Luxmanda_3100BP, which provides a missing link of the pastoralist population that brought ancestry most closely related to the ancient Levant to southern Africa, and which is also closely but not exclusively related to present-day Cushitic speakers; and f) ancestry more closely related to the Iran Neolithic than what is found in Tanzania_Luxmanda_3100BP, and which may have entered the Horn of Africa in later migrations.
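To illustrate what a 2-source mixture proportion means, here is a minimal sketch that fits a single weight by least squares on allele frequencies. This is a swapped-in simplification, not the procedure behind the figure, which the caption does not spell out. The function name `two_source_weight` and the toy frequencies are hypothetical.

```python
def two_source_weight(target, source1, source2):
    """Least-squares weight w such that target ~ w*source1 + (1-w)*source2.

    Each argument is a list of per-SNP allele frequencies in [0, 1].
    Illustrative simplification only; not the model used for the figure.
    """
    # Closed-form solution of min_w sum((t - (w*a + (1-w)*b))**2)
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source1, source2))
    den = sum((a - b) ** 2 for a, b in zip(source1, source2))
    if den == 0:
        raise ValueError("sources are identical; the weight is not identifiable")
    # Clamp to a valid proportion
    return min(1.0, max(0.0, num / den))


# Toy example: a target built as 25% source1 and 75% source2
source1 = [0.9, 0.1, 0.5, 0.3]
source2 = [0.1, 0.7, 0.5, 0.9]
target = [0.25 * a + 0.75 * b for a, b in zip(source1, source2)]
weight = two_source_weight(target, source1, source2)
```

With real data the fit would also need a goodness-of-fit check to decide between 1-, 2-, and 3-source models, which this sketch omits.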
Figure 3. Mixture events in the deeper population history of continental African lineages
A) Maximum likelihood tree of genome sequences from present-day and ancient populations, excluding populations with evidence of asymmetrical allele sharing with non-Africans indicative of recent gene flow (Table S5). Nodes with bootstrap support > 95% are indicated with a circle. B) A symmetry test of the hypothesis that ancient southern Africans are an outgroup lineage to other African populations, which can be rejected for most pairs. C) Asymmetry between western African Mende and Yoruba in the 1000 Genomes Project data is maximized in the Yoruba’s excess affinity to eastern Africans and non-Africans, but highly significant also for groups as distant as southern Africans. D) Admixture graph solution where Mende from Sierra Leone and Yoruba from Nigeria have ancestry from a basal western African lineage. The other source of western African ancestry is most closely related to eastern Africans and non-Africans (Fig. S5D), which could be consistent with an expansion from eastern Africa. Note that the exact proportion of ‘West Africa A’ ancestry is not well constrained by the model, but the difference between Yoruba and Mende is highly significant (panel C). E) Admixture graph solution where the Yoruba have gene flow from a population related to both southern and eastern Africa, which could be consistent with a more complex pattern of isolation-by-distance in the continent.
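A symmetry test of the kind mentioned in panel B is commonly built on an f4-type statistic with a block jackknife standard error. The sketch below assumes that form; the caption does not state the exact statistic, and the unweighted delete-one-block jackknife used here is a simplification of the weighted version that population genetics tools typically apply. The names `f4` and `f4_jackknife` and the toy data are hypothetical.

```python
from math import sqrt


def f4(snps):
    """Mean of (pA - pB) * (pC - pD) over SNPs.

    snps is a list of (pA, pB, pC, pD) allele-frequency tuples.
    Under a symmetric tree ((A, B), (C, D)) the expectation is zero.
    """
    return sum((a - b) * (c - d) for a, b, c, d in snps) / len(snps)


def f4_jackknife(blocks):
    """Estimate f4 with an unweighted delete-one-block jackknife.

    blocks is a list of lists of per-SNP tuples, one list per contiguous
    genomic block. Returns (estimate, standard_error, z_score).
    """
    n = len(blocks)
    estimate = f4([s for blk in blocks for s in blk])
    # Recompute the statistic with each block left out in turn
    leave_one_out = [
        f4([s for j, blk in enumerate(blocks) if j != i for s in blk])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy data: three single-SNP blocks
toy_blocks = [[(1.0, 0.0, 1.0, 0.0)], [(1.0, 0.0, 0.0, 0.0)], [(1.0, 0.0, 1.0, 0.0)]]
estimate, se, z = f4_jackknife(toy_blocks)
```

A large absolute Z-score indicates that the symmetric tree can be rejected for that set of populations.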
Figure 4. Ancient genomes provide evidence of natural selection in present-day southern African San populations
We computed branch-specific allele frequency differentiation in 6 present-day high-coverage San genomes compared to a pool of two ~2000 BP South African genomes as an outgroup, using two approaches. A) We computed the statistic in windows of 500 kb separated by 10 kb. We also estimated the genome-wide average and standard deviation of the statistic using windows separated by at least 5 Mb, and transformed the genome-wide distribution of the sliding windows to be approximately normal (right panel). We observe outliers 15 standard deviations from the mean in a taste receptor gene cluster on chromosome 12, and a secondary peak in the Keratin Associated Protein 4 gene cluster. The outgroup used was 4 Central African Mbuti genomes. See Table 2 for details on all major outlying regions. B) Illustration of the branch-specific allele frequency differentiation approach. C) We computed the statistic and block jackknife standard errors for 208 gene ontology categories with at least 50 genes each (y-axis). The outgroup used was western Africans. As a control to confirm that outlier categories do not show larger magnitudes of allele frequency differentiation across populations, we replaced the present-day San with the central African Mbuti (x-axis).
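The windowing scheme in panel A can be sketched as follows, under stated assumptions: the per-SNP statistic is taken as given (its definition is not in the caption), positions are for a single chromosome, and the normalizing transformation mentioned in the caption is omitted. The function names and the toy data are hypothetical.

```python
from statistics import mean, stdev


def window_means(positions, values, window=500_000, step=10_000):
    """Mean of a per-SNP statistic in overlapping windows on one chromosome.

    Returns a list of (window_start, window_mean) for non-empty windows.
    """
    out = []
    if not positions:
        return out
    last = max(positions)
    start = 0
    while start <= last:
        in_window = [v for p, v in zip(positions, values) if start <= p < start + window]
        if in_window:
            out.append((start, mean(in_window)))
        start += step
    return out


def z_scores(windows, min_spacing=5_000_000):
    """Standardize window means against widely spaced windows.

    The mean and standard deviation come only from windows whose starts are
    at least min_spacing apart, so overlapping windows do not inflate the
    apparent sample size. Needs at least two such windows.
    """
    spaced, next_ok = [], None
    for start, m in windows:
        if next_ok is None or start >= next_ok:
            spaced.append(m)
            next_ok = start + min_spacing
    mu, sd = mean(spaced), stdev(spaced)
    return [(start, (m - mu) / sd) for start, m in windows]


# Toy data with small window sizes so the result is easy to inspect
toy_positions = list(range(0, 60, 5))
toy_values = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 9, 9]
toy_windows = window_means(toy_positions, toy_values, window=10, step=10)
toy_z = z_scores(toy_windows, min_spacing=10)
```

In the toy data the last window carries the elevated values and therefore gets the largest standardized score.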
