Parallel evolution of genes and languages in the Caucasus region

Mol Biol Evol. 2011 Oct;28(10):2905-20. doi: 10.1093/molbev/msr126. Epub 2011 May 13.


We analyzed 40 single-nucleotide polymorphism (SNP) and 19 short tandem repeat (STR) Y-chromosomal markers in a large sample of 1,525 indigenous individuals from 14 populations in the Caucasus and in 254 additional individuals representing potential source populations. We also employed a lexicostatistical approach to reconstruct the history of the languages of the North Caucasian family spoken by the Caucasus populations. We found a different major haplogroup to be prevalent in each of four sets of populations that occupy distinct geographic regions and belong to different linguistic branches. The haplogroup frequencies correlated with geography and, even more strongly, with language. Within haplogroups, a number of haplotype clusters were shown to be specific to individual populations and languages. The data suggested a direct origin of Caucasus male lineages from the Near East, followed by high levels of isolation, differentiation, and genetic drift in situ. Comparison of genetic and linguistic reconstructions covering the last few millennia showed striking correspondences between the topology and dates of the respective gene and language trees, as well as with documented historical events. Overall, in the Caucasus region, unmatched levels of gene-language coevolution occurred within geographically isolated populations, probably because of the region's mountainous terrain.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian Continental Ancestry Group / genetics
  • Chromosomes, Human, Y
  • European Continental Ancestry Group / genetics*
  • Evolution, Molecular*
  • Gene Pool
  • Genetics, Population
  • Haplotypes
  • Humans
  • Language*
  • Linguistics
  • Male
  • Microsatellite Repeats
  • Phylogeny*
  • Polymorphism, Single Nucleotide
  • Russia
  • Sequence Analysis, DNA