Am J Hum Genet, 93 (3), 422-38

Genetic Evidence for Recent Population Mixture in India



Priya Moorjani et al. Am J Hum Genet.


Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.


Figure 1
Principal Component Analysis. (A) Map showing the sampling locations for Indian groups in our study (except central_mix1_nihali). (B) Principal component analysis (PCA) of 70 of 73 groups with non-Indians (European Americans [CEU], Georgian, Iranian, Basque, and Han Chinese [CHB]) highlights the “Indian cline,” a gradient of West Eurasian relatedness. Great Andamanese and Siddi are not included because of their evidence of relatively recent admixture with non-Indian groups, and central_mix1_nihali is not included because it includes multiple ethno-linguistic groups under one label. To aid visualization, we represent each group by the average PCA coordinates of all the individuals in it. Footnote a indicates groups from Metspalu et al. and footnote b indicates groups from HGDP.
Figure 2
Dates of Mixture. We pool samples based on linguistic affiliation (Indo-Europeans [n = 175] and Dravidians [n = 144]) and run rolloff (with the merged Illumina-Affymetrix data set of 86,213 SNPs) to measure the LD resulting from mixture between ANI and ASI. To obtain weights proportional to the allele frequency differences between ANI and ASI at each SNP (needed to run rolloff), we use SNP loadings obtained from a PCA of Basque and a pool of groups from the linguistic cluster whose admixture is not being dated (e.g., we run PCA with Indo-European and Basque when we are dating Dravidian admixture). The output of rolloff is represented as points and the line shows the exponential fit ($y = Ae^{-nd} + c$) used for estimating the time in generations (n) since mixture. The nonzero constant c allows for variability in the mixture proportion among the groups we pooled and d is the genetic distance in Morgans. Standard errors are computed via a weighted block jackknife (see Material and Methods).
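
As a rough illustration of the exponential fit described in this caption, the sketch below recovers n from a decay curve of the form y = A·exp(-n·d) + c. It is not the rolloff implementation: the function name `fit_exponential_decay`, the candidate grid for n, the distance bins, and the synthetic, noise-free data are all assumptions made for illustration. The grid search with a closed-form least-squares solve for A and c is a simple stand-in for whatever optimizer the authors used.

```python
import math


def fit_exponential_decay(d, y, n_grid):
    """Fit y = A*exp(-n*d) + c by profiling over candidate n values.

    For a fixed n the model is linear in (A, c), so both are solved in
    closed form by ordinary least squares. The n with the smallest
    residual sum of squares over the grid is returned.
    (Hypothetical helper, not part of rolloff.)
    """
    m = len(d)
    mean_y = sum(y) / m
    best = None
    for n in n_grid:
        x = [math.exp(-n * di) for di in d]
        mean_x = sum(x) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # x is constant, so A and c are not identifiable
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        A = sxy / sxx
        c = mean_y - A * mean_x
        rss = sum((yi - (A * xi + c)) ** 2 for xi, yi in zip(x, y))
        if best is None or rss < best[0]:
            best = (rss, n, A, c)
    if best is None:
        raise ValueError("no usable candidate n in n_grid")
    _, n, A, c = best
    return n, A, c


# Synthetic example (assumed values, not data from the study):
# genetic distances d in Morgans and a decay curve with known parameters.
d_values = [0.005 * k for k in range(1, 101)]  # 0.005 to 0.5 Morgans
true_n, true_A, true_c = 100, 0.2, 0.01
y_values = [true_A * math.exp(-true_n * di) + true_c for di in d_values]

n_hat, A_hat, c_hat = fit_exponential_decay(d_values, y_values, range(1, 301))
print(n_hat, A_hat, c_hat)
```

The fitted n is in generations. Converting it to years requires a generation time, which this page does not state, so that step is left out.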
