Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages

Science. 2023 Jul 28;381(6656):eabg0818. doi: 10.1126/science.abg0818. Epub 2023 Jul 28.

Abstract

The origins of the Indo-European language family are hotly disputed. Bayesian phylogenetic analyses of core vocabulary have produced conflicting results, with some supporting a farming expansion out of Anatolia ~9000 years before present (yr B.P.), while others support a spread with horse-based pastoralism out of the Pontic-Caspian Steppe ~6000 yr B.P. Here we present an extensive database of Indo-European core vocabulary that eliminates past inconsistencies in cognate coding. Ancestry-enabled phylogenetic analysis of this dataset indicates that few ancient languages are direct ancestors of modern clades and produces a root age of ~8120 yr B.P. for the family. Although this date is not consistent with the Steppe hypothesis, it does not rule out an initial homeland south of the Caucasus, with a subsequent branch northward onto the steppe and then across Europe. We reconcile this hybrid hypothesis with recently published ancient DNA evidence from the steppe and the northern Fertile Crescent.

MeSH terms

  • Bayes Theorem
  • Europe
  • Farms
  • Language* / history
  • Phylogeny