Least Effort and the Origins of Scaling in Human Language

Ramon Ferrer i Cancho et al. Proc Natl Acad Sci U S A. 100(3), 788-791.

Abstract

The emergence of a complex language is one of the fundamental events of human evolution, and several remarkable features suggest the presence of fundamental principles of organization. These principles seem to be common to all languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank. The possible origins of this law have been controversial, and its meaningfulness is still an open question. In this article, the early hypothesis of Zipf of a principle of least effort for explaining the law is shown to be sound. Simultaneous minimization in the effort of both hearer and speaker is formalized with a simple optimization process operating on a binary matrix of signal-object associations. Zipf's law is found in the transition between referentially useless systems and indexical reference systems. Our finding strongly suggests that Zipf's law is a hallmark of symbolic reference and not a meaningless feature. The implications for the evolution of language are discussed. We explain how language evolution can take advantage of a communicative phase transition.
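Zipf's law as stated above can be made concrete numerically: with exponent α = 1, the normalized frequency of the word of rank k is p(k) = k^(−α) / Σ_j j^(−α), so the rank-1 word is exactly twice as frequent as the rank-2 word. A minimal sketch (the function name is ours, not from the article):

```python
def zipf_frequencies(n_words, alpha=1.0):
    """Normalized Zipf frequencies: p(k) proportional to k**(-alpha)."""
    weights = [k ** -alpha for k in range(1, n_words + 1)]
    z = sum(weights)  # normalization constant (generalized harmonic number)
    return [w / z for w in weights]

p = zipf_frequencies(1000)
# p[0] / p[1] == 2: the rank-1 word is twice as frequent as the rank-2 word
```

For α = 1 the ratio p(k)/p(k') is simply k'/k, which is the hallmark rank-frequency scaling the article attributes to human language.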

Figures

Figure 1
Basic scheme of the evolutionary algorithm used in this article. Starting from a given signal–object matrix A (here n = m = 3), the algorithm performs a change in a small number of bits (specifically, with probability ν, each aij can flip). The cost function Ω is then evaluated, and the new matrix is accepted provided that a lower cost is achieved. Otherwise, we start again with the original matrix. At the beginning, A is set up with a fixed density ρ of ones.
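The optimization loop described in this caption can be sketched as follows. This is an illustrative reconstruction, not the authors' code: we assume the least-effort cost takes the form Ω = λ·H(R|S) + (1−λ)·H(S) (hearer effort plus speaker effort), with joint signal–object probabilities taken proportional to the entries of A; all function names are ours.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability vector; ignores zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def cost(A, lam):
    """Assumed least-effort cost: lam * H(R|S) + (1 - lam) * H(S)."""
    total = A.sum()
    if total == 0:
        return float("inf")  # no associations at all: reject
    p_joint = A / total              # simplification: p(s_i, r_j) proportional to a_ij
    p_s = p_joint.sum(axis=1)        # marginal signal distribution
    h_s = entropy(p_s)               # speaker effort H(S)
    h_sr = entropy(p_joint.ravel())  # joint entropy H(S, R)
    return lam * (h_sr - h_s) + (1 - lam) * h_s  # H(R|S) = H(S,R) - H(S)

def minimize(n, m, lam, rho=0.5, T=None, nu=None, seed=0):
    """Flip each bit of A with probability nu; keep the change only if Omega drops."""
    rng = np.random.default_rng(seed)
    T = 2 * n * m if T is None else T
    nu = 2 / (n * m) if nu is None else nu
    A = (rng.random((n, m)) < rho).astype(int)  # initial density rho of ones
    best = cost(A, lam)
    for _ in range(T):
        B = A ^ (rng.random((n, m)) < nu)  # mutate a few entries
        c = cost(B, lam)
        if c < best:                        # accept only strict improvements
            A, best = B, c
    return A, best
```

With λ below the transition the loop drives H(S) down (few signals reused for everything); above it, H(R|S) dominates and signals become unambiguous, matching the two phases described in Figure 2.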
Figure 2
(A) 〈I_n(S, R)〉, the average mutual information, as a function of λ. λ* = 0.41 divides 〈I_n(S, R)〉 into a no-communication phase and a perfect-communication phase. (B) Average (effective) lexicon size, 〈L〉, as a function of λ. An abrupt change is seen at λ ≈ 0.41 in both quantities. Averages over 30 replicas: n = m = 150, T = 2nm, and ν = 2/(nm).
Figure 3
Signal normalized frequency, P(k), versus rank, k, for λ = 0.3 (A), λ = λ* = 0.41 (B), and λ = 0.5 (C) (averages over 30 replicas: n = m = 150 and T = 2nm). The dotted lines show the distribution that would be obtained if signals and objects were connected following a Poisson distribution of degrees with the same number of connections as the minimum-energy configurations. The distribution in B is consistent with human language (α = 1).
