Genetic and linguistic histories in Central Asia inferred using approximate Bayesian computations

Proc Biol Sci. 2017 Aug 30;284(1861):20170706. doi: 10.1098/rspb.2017.0706.


Linguistic and genetic data have been widely compared, but the histories underlying these descriptions are rarely jointly inferred. We developed a unique methodological framework for jointly analysing language diversity and genetic polymorphism data, to infer the past history of separation, exchange and admixture events among human populations. This method relies on approximate Bayesian computations, which make it possible to identify the most probable historical scenario underlying each type of data and to infer the parameters of these scenarios. For this purpose, we developed a new computer program, PopLingSim, that simulates the evolution of linguistic diversity, and coupled it with an existing coalescent-based genetic simulation program to simulate both linguistic and genetic data within a set of populations. Applying this new program to a wide linguistic and genetic dataset from Central Asia, we found several differences between linguistic and genetic histories. In particular, we showed how genetic and linguistic exchanges differed in the past in this area: some cultural exchanges were maintained without genetic exchanges. The methodological framework and the linguistic simulation tool developed here can be used in future work to disentangle the complex linguistic and genetic evolution underlying human biological and cultural histories.
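
For readers unfamiliar with approximate Bayesian computation, the general idea of scenario choice and parameter inference can be sketched with a plain rejection sampler. The sketch below is purely illustrative and is not the authors' pipeline or the PopLingSim interface: the two scenario labels, the toy simulator `simulate_distance`, the single summary statistic, the uniform priors and the tolerance are all assumptions made for this example.

```python
import random

SCENARIOS = ("isolation", "exchange")  # hypothetical scenario labels


def simulate_distance(scenario, rate, rng, n_items=100):
    """Toy simulator: fraction of items that differ between two populations."""
    # Assumption for illustration only: ongoing exchange halves the
    # probability that any given item differs between the populations.
    p_differ = rate if scenario == "isolation" else rate / 2.0
    differing = sum(1 for _ in range(n_items) if rng.random() < p_differ)
    return differing / n_items


def abc_rejection(observed, n_sims=5000, tolerance=0.03, seed=1):
    """Keep the (scenario, rate) draws whose simulated statistic is close to the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        scenario = rng.choice(SCENARIOS)   # uniform prior over scenarios
        rate = rng.uniform(0.0, 1.0)       # uniform prior over the parameter
        simulated = simulate_distance(scenario, rate, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append((scenario, rate))
    return accepted


def summarise(accepted):
    """Approximate scenario probabilities and the mean rate under the best scenario."""
    if not accepted:
        raise ValueError("no simulations accepted; raise n_sims or tolerance")
    probs = {
        s: sum(1 for sc, _ in accepted if sc == s) / len(accepted)
        for s in SCENARIOS
    }
    best = max(probs, key=probs.get)
    rates = [r for sc, r in accepted if sc == best]
    return probs, best, sum(rates) / len(rates)


if __name__ == "__main__":
    probs, best, mean_rate = summarise(abc_rejection(observed=0.8))
    print(probs, best, mean_rate)
```

In this toy setting an observed distance of 0.8 favours the "isolation" scenario, because the "exchange" simulator cannot produce a difference probability above 0.5. A real analysis would use many summary statistics computed on both linguistic and genetic data, and typically a more refined acceptance or regression step than simple rejection.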

Keywords: approximate Bayesian computation; cultural evolution; gene–language coevolution; population genetics.

MeSH terms

  • Asia
  • Bayes Theorem
  • Genetic Variation
  • Genetics, Population*
  • Humans
  • Language*
  • Software