Curr Biol, 25 (1), 1-9

Detecting Regular Sound Changes in Linguistics as Events of Concerted Evolution



Daniel J Hruschka et al. Curr Biol.


Background: Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family.

Results: Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations.

Conclusions: We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.


Figure 1. Posterior Distributions of Log-Likelihoods from the Sporadic and Regular Change Models. Mean log-likelihoods: sporadic change model (purple) = −32,303.9 ± 15; regular change model (green) = −29,196.2 ± 15. Δlog-likelihood = 3,107, based on an average of 74.27 ± 0.47 parameters describing regular sound changes. Deviance information criterion test, ΔDIC = 3,739; values > 0 support the regular model. Note: the x axis is broken.
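As a quick arithmetic check on the Figure 1 caption, the sketch below recomputes the log-likelihood difference from the two reported means and divides it by the reported average number of regular-change parameters. The variable names are illustrative, and the per-parameter figure is a derived convenience number, not a statistic reported in the source.

```python
# Values copied from the Figure 1 caption.
ll_sporadic = -32303.9   # mean log-likelihood, sporadic change model
ll_regular = -29196.2    # mean log-likelihood, regular change model
n_regular_params = 74.27 # average number of regular-change parameters

# Improvement of the regular model over the sporadic model.
delta_ll = ll_regular - ll_sporadic

# Assumption: a simple average gain per added parameter, for intuition only.
gain_per_param = delta_ll / n_regular_params

print(f"delta log-likelihood: {delta_ll:.1f}")
print(f"average gain per regular-change parameter: {gain_per_param:.2f}")
```

The recomputed difference is about 3,107.7, which agrees with the caption's 3,107 up to rounding of the reported means.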
Figure 2. Phylogenetic Trees of the Turkic Language Family. Consensus topologies for the model allowing only sporadic changes (A) and the model allowing regular sound changes (B). Regular sound changes are indicated along the top and bottom of branches of the topology: events in black show directional changes from the beginning to ending phoneme; events shown in purple indicate two phonemes that have replaced each other. The model additionally estimates the position of each regular sound change along the branch. Mean estimated age of root between Chuvash and other Turkic languages: sporadic model (A) = 2408 BCE, with 95% credible intervals of 3993–1279 BCE; regular model (B) = 204 BCE, with 95% credible intervals of 605 BCE–81 CE. The posterior date of the calibration node (red dot; [18, 19]) is 1017 ± 20 CE.
Figure 3. Performance of the Model of Regular Change in Predicting Sound Changes. Colored (nongray) cells correspond to instances of regular sound change as proposed by linguists [17, 20] (see text and Table S2), ranging from ≥10× improvement by the regular change model (dark red) to cases in which the sporadic model outperformed the regular model (blue). Gray cells correspond to cases in which the ancestral phoneme has been retained (no phonological change has occurred). Geometric mean improvement across all colored cells (probability of regular model/probability of sporadic model) = 1.87 ± 2.98, range = 0.14 to 150.12; n = 371. Geometric mean improvement excluding cases of partial ancestral retention (white cells) = 3.71 ± 5.14, range = 0.14 to 150.12; n = 179. Leftmost columns: LA = ancestral phoneme derived from linguists’ proposals; MA = model-derived ancestral phoneme.
Figure 4. Sound Transition Networks Showing Regular and Sporadic Changes. Transitions among consonants (circles) and among vowels (squares) are frequent and regular (many connections) but are rare between them, save for those mediated by the semivowel w. Transitions are more frequent among sounds with similar places of articulation: consonants are coded as bilabials-labiodentals (red), nasal (light green), uvular-velar-glottal (purple), postalveolar-palatals (blue), and dental-alveolars (green); vowels divide into high (gray) and higher-mid to low (white) subsets. Blue lines denote sporadic transitions, with thicker lines denoting faster underlying rates. Red lines denote regular changes; arrows indicate the direction of change.
Figure 5. Regular Sound Changes. (A) Approximately linear trend in the cumulative frequency of regular sound changes through time, indicating a constant rate of regular sound change of about 0.0026 events per branch per annum; trend is counts of regular change events per unit time in the tree, averaged across the posterior sample of trees. Purple line is the mean trend; yellow line is 1:1 trend. (B) Expected Poisson (gray) and observed (purple) number of regular sound changes per branch. Expected values generated from a Poisson distribution with mean 0.0026 × t were calculated for each branch of the tree, where t is the length of the branch in years (generalized linear model test of deviation from Poisson expectation not significant: χ² = 16.95, df = 14, p > 0.26). (C) Cumulative waiting times until the next regular sound change event (purple) and best-fit exponential distribution (gray). Exponential mean = 303 years; 95% confidence interval includes 385 years or 1/0.00262. The exponential provided the best fit when compared against gamma, Weibull, and log-normal cumulative densities. (D) Expected range (max-min) of regular sound changes occurring in the histories of the 26 Turkic languages. Data were generated from 10,000 simulations of the Poisson expectation in each of the branches of the tree in Figure 2. Yellow triangle shows observed range (15–1); yellow square shows range adjusting for unique phonemes in Chuvash (see text).
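The Poisson expectation described in panels (B) and (C) can be illustrated with a minimal sketch. It takes the rate of 0.0026 events per branch per annum from the caption, computes the expected number of regular sound changes on a branch of length t as 0.0026 × t, and derives the mean waiting time implied by that rate as its reciprocal. The function names and the 500-year branch length are hypothetical choices made for illustration; this is not the authors' inference code.

```python
import math

# Rate from the Figure 5 caption: regular sound changes per branch per year.
RATE_PER_YEAR = 0.0026


def expected_changes(branch_years: float, rate: float = RATE_PER_YEAR) -> float:
    """Poisson mean for a branch of the given length (rate x t)."""
    return rate * branch_years


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def mean_waiting_time(rate: float = RATE_PER_YEAR) -> float:
    """Mean of the exponential waiting time implied by a Poisson rate."""
    return 1.0 / rate


# Hypothetical branch length, chosen only to illustrate the calculation.
branch_years = 500.0
mean = expected_changes(branch_years)

for k in range(4):
    print(f"P({k} regular changes on a {branch_years:.0f}-year branch) = "
          f"{poisson_pmf(k, mean):.3f}")

print(f"implied mean waiting time: {mean_waiting_time():.1f} years")
```

The implied mean waiting time of roughly 385 years is the value that the caption says falls inside the 95% confidence interval of the fitted exponential mean (303 years).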



    1. Liao D. Concerted evolution: molecular mechanism and biological implications. Am. J. Hum. Genet. 1999;64:24–30.
    2. Ohta T. Gene conversion and evolution of gene families: an overview. Genes (Basel) 2010;1:349–356.
    3. Flot J.-F., Hespeels B., Li X., Noel B., Arkhipova I., Danchin E.G., Hejnol A., Henrissat B., Koszul R., Aury J.-M. Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga. Nature. 2013;500:453–457.
    4. Rozen S., Skaletsky H., Marszalek J.D., Minx P.J., Cordum H.S., Waterston R.H., Wilson R.K., Page D.C. Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature. 2003;423:873–876.
    5. Skaletsky H., Kuroda-Kawaguchi T., Minx P.J., Cordum H.S., Hillier L., Brown L.G., Repping S., Pyntikova T., Ali J., Bieri T. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature. 2003;423:825–837.
