Support for linguistic macrofamilies from weighted sequence alignment

Proc Natl Acad Sci U S A. 2015 Oct 13;112(41):12752-7. doi: 10.1073/pnas.1500331112. Epub 2015 Sep 24.

Abstract

Computational phylogenetics is in the process of revolutionizing historical linguistics. Recent applications have shed new light on controversial issues, such as the location and time depth of language families and the dynamics of their spread. So far, these approaches have been limited to individual language families because they rely on a large body of expert cognacy judgments or grammatical classifications, which is currently unavailable for most language families. The present study pursues a different approach. Starting from raw phonetic transcription of core vocabulary items from very diverse languages, it applies weighted string alignment to track both phonetic and lexical change. Applied to a collection of ∼1,000 Eurasian languages and dialects, this method, combined with phylogenetic inference, leads to a classification in excellent agreement with established findings of historical linguistics. Furthermore, it provides strong statistical support for several putative macrofamilies contested in current historical linguistics. In particular, there is a solid signal for the Nostratic/Eurasiatic macrofamily.
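
As a rough illustration of what weighted string alignment means, the sketch below scores two transcribed word forms with a standard Needleman-Wunsch global alignment. It is not the procedure used in the study: the function names, the gap penalty, and the toy similarity weights (`toy_sim`, based only on a vowel/consonant distinction) are assumptions made for this example, and the abstract does not state how the actual weights are derived.

```python
def weighted_alignment_score(a, b, sim, gap=-1.0):
    """Global (Needleman-Wunsch) alignment score of sequences a and b.

    sim(x, y) returns the weight for aligning symbol x with symbol y;
    gap is the penalty for aligning a symbol with nothing.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),  # match or substitution
                score[i - 1][j] + gap,                          # symbol of a left unmatched
                score[i][j - 1] + gap,                          # symbol of b left unmatched
            )
    return score[n][m]


# Hypothetical weights for illustration only, not taken from the paper.
VOWELS = set("aeiou")


def toy_sim(x, y):
    if x == y:
        return 2.0   # identical sounds
    if (x in VOWELS) == (y in VOWELS):
        return 0.5   # same broad class (vowel/vowel or consonant/consonant)
    return -1.0      # vowel aligned with consonant


if __name__ == "__main__":
    print(weighted_alignment_score("hand", "hant", toy_sim))
    print(weighted_alignment_score("hand", "mano", toy_sim))
```

With weights of this kind, forms that differ only by a plausible sound substitution score higher than unrelated forms, which is the sense in which a weighted alignment can be sensitive to both phonetic and lexical change.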

Keywords: cultural evolution; historical linguistics; linguistic macrofamilies; mass lexical comparison; phylogenetic methods.

Publication types

  • Research Support, Non-U.S. Gov't