PLoS Genet. 12 (5), e1006059

The Great Migration and African-American Genomic Diversity



Soheil Baharian et al. PLoS Genet.


We present a comprehensive assessment of genomic diversity in the African-American population by studying three genotyped cohorts comprising 3,726 African-Americans from across the United States that provide a representative description of the population across all US states and socioeconomic status. An estimated 82.1% of ancestors to African-Americans lived in Africa prior to the advent of transatlantic travel, 16.7% in Europe, and 1.2% in the Americas, with increased African ancestry in the southern United States compared to the North and West. Combining demographic models of ancestry and those of relatedness suggests that admixture occurred predominantly in the South prior to the Civil War and that ancestry-biased migration is responsible for regional differences in ancestry. We find that recent migrations also caused a strong increase in genetic relatedness among geographically distant African-Americans. Long-range relatedness among African-Americans and between African-Americans and European-Americans thus tracks north- and west-bound migration routes followed during the Great Migration of the twentieth century. By contrast, short-range relatedness patterns suggest comparable mobility of ∼15-16 km per generation for African-Americans and European-Americans, as estimated using a novel analytical model of isolation-by-distance.

Conflict of interest statement

The authors have declared that no competing interests exist.


Fig 1. Inferred regional ancestry proportions for the HRS and SCCS cohorts: (A) African, (B) European, and (C) Native American ancestries.
(D) Local ancestry assignment along the autosomes for an African-American individual from HRS. (E) Comparison of the African ancestry proportions in the HRS, SCCS, and 23andMe stratified by state. Error bars represent 68% confidence intervals derived using sample bootstrap and, thus, do not account for possible sampling biases. 23andMe proportions are from Ref. [12] and are reported for ease of comparison.
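The 68% bootstrap confidence intervals mentioned in the Fig 1 caption can be illustrated with a minimal percentile-bootstrap sketch. It is not the authors' code: the function name `bootstrap_ci`, the number of resamples, and the example ancestry proportions are all hypothetical, and the percentile method is assumed because the caption does not say which bootstrap variant was used.

```python
import random


def bootstrap_ci(values, level=0.68, n_boot=2000, seed=0):
    """Percentile bootstrap interval for the mean of `values`.

    Hypothetical helper: resamples individuals with replacement,
    recomputes the mean each time, and reads the interval off the
    sorted resampled means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)      # lower percentile index
    hi_idx = int((1 + level) / 2 * n_boot) - 1  # upper percentile index
    return means[lo_idx], means[hi_idx]


# Made-up African ancestry proportions for individuals in one state.
proportions = [0.78, 0.85, 0.81, 0.90, 0.74, 0.83, 0.88, 0.79, 0.86, 0.80]
low, high = bootstrap_ci(proportions)
print(f"68% CI for the mean: [{low:.3f}, {high:.3f}]")
```

As the caption notes, an interval of this kind reflects only resampling variability within the cohort and cannot correct for a cohort that is itself unrepresentative.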
Fig 2. Admixture times and proportions of ancestral populations for SCCS in (A) the model with two pulses of admixture and (B) the model with three pulses of admixture.
Because the model features a continuous time parameter but discrete generation times, a single pulse occurring at a fractional time contributes migrants to the two adjacent discrete generation times. African, European, and Native American ancestries are displayed respectively in blue, red, and yellow. Rectangles show the proportion of each ancestry at each generation. Pie charts represent migrations, with the size of the pie representing the amounts of migrants at a given generation and the sectors representing the proportion of migrants coming from each source population. (C) Distribution of continuous ancestry tract lengths (dots) compared with predictions from the best-fit model (lines) for SCCS. Points in the shaded area are within one standard deviation of the predicted result. Kinks in the distribution are due to the finite length of chromosomes [16]. (D) Inferred time to admixture and African ancestry proportions as functions of birth year in HRS African-Americans. (E) Proportions of African ancestry in African-Americans within the North, South, and West using region of birth, region of residence, and migration status; bootstrap p-values are calculated between disjoint sets of individuals.
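The Fig 2 caption's remark that a pulse at a fractional time contributes migrants to the two adjacent generations can be sketched as follows. The caption does not state how the contribution is divided, so the linear weighting by distance to each generation used here is an assumption, and the function name `split_pulse` and the example numbers are hypothetical.

```python
import math


def split_pulse(time, fraction):
    """Spread one admixture pulse over the adjacent integer generations.

    `time` is the continuous pulse time in generations and `fraction`
    is the migrant proportion of the pulse. The linear split is an
    illustrative assumption, not the published model's rule.
    """
    lower = math.floor(time)
    weight_upper = time - lower  # how far the pulse sits past `lower`
    if weight_upper == 0:
        return {lower: fraction}  # integer time: a single generation
    return {
        lower: fraction * (1 - weight_upper),
        lower + 1: fraction * weight_upper,
    }


# Hypothetical pulse of 20% migrants occurring 6.25 generations ago.
print(split_pulse(6.25, 0.2))
```

Under this assumption the two contributions always sum to the original pulse size, so the total amount of migration is unchanged by the discretization.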
Fig 3. Pairwise genetic relatedness across US census regions among (A) African-Americans, (B) European-Americans, and (C) African-Americans and European-Americans.
(D) Census-based prediction for African-Americans (see Materials and Methods). On each map, the line connecting two regions shows the average relatedness between individuals in those regions, and the thickness and opacity of the lines are on a linear scale between the minimum and maximum values shown above the map. Relatedness between regions with fewer than 10,000 possible pairs of individuals is not shown (see Materials and Methods for details). All numbers are in units of cM. (E) Decay of average IBD (shown in logarithmic scale) as a function of distance using IBD segments of length 18cM or longer from HRS (dots), compared to the analytical model (lines).



    1. Voyages Database. Voyages: The Trans-Atlantic Slave Trade Database; 2009. Available from:
    1. Ruggles S, Alexander JT, Genadek K, Goeken R, Schroeder MB, Sobek M. Integrated Public Use Microdata Series: Version 5.0 [Machine-readable database]. Minneapolis: University of Minnesota; 2010.
    1. Wilkerson I. The warmth of other suns: The epic story of America’s great migration. Vintage; 2010.
    1. Lemann N. The Promised Land: The Great Black Migration and How It Changed America. Vintage; 1992.
    1. Bustamante CD, De La Vega FM, Burchard EG. Genomics for the world. Nature. 2011 July;475:163–165. doi:10.1038/475163a
