PLoS One, 10 (9), e0135820

Genetic Heritage of the Balto-Slavic Speaking Populations: A Synthesis of Autosomal, Mitochondrial and Y-Chromosomal Data


Alena Kushniarevich et al. PLoS One.


The Slavic branch of the Balto-Slavic sub-family of Indo-European languages underwent rapid divergence as a result of the spatial expansion of its speakers from Central-East Europe in early medieval times. This expansion, mainly to East Europe and the northern Balkans, resulted in the incorporation of genetic components from numerous autochthonous populations into the Slavic gene pools. Here, we characterize genetic variation in all extant ethnic groups speaking Balto-Slavic languages by analyzing mitochondrial DNA (n = 6,876), Y-chromosomes (n = 6,079) and genome-wide SNP profiles (n = 296), within the context of other European populations. We also reassess the phylogeny of Slavic languages within the Balto-Slavic branch of Indo-European. We find that genetic distances among Balto-Slavic populations, based on autosomal and Y-chromosomal loci, show a high correlation (0.9) both with each other and with geography, but a slightly lower correlation (0.7) with mitochondrial DNA and linguistic affiliation. The data suggest that genetic diversity of the present-day Slavs was predominantly shaped in situ, and we detect two different substrata: 'central-east European' for West and East Slavs, and 'south-east European' for South Slavs. The distribution pattern of segments identical by descent (IBD) between groups of East-West and South Slavs suggests shared ancestry or a modest gene flow between those two groups, which might derive from the historic spread of Slavic people.

Conflict of interest statement

Competing Interests: The authors have read the journal's policy and the authors of this manuscript have the following competing interests: Co-author Toomas Kivisild is a PLOS ONE Academic Editor. Additionally, Lejla Mulahasanovic is employed by the Center for Genomics and Transcriptomics (CeGaT GmbH). There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.


Fig 1. The Balto-Slavic populations analyzed in this study and the tree of Balto-Slavic languages.
The map (lower panel) shows the geographical distribution of Balto-Slavic populations (colored areas) within Europe. The symbols on the map represent the geographic locations of the populations genotyped. The map was created in the GeneGeo software as described previously [68,75]. A manually constructed consensus phylogenetic tree of the Balto-Slavic languages (upper panel) is based on the StarlingNJ, NJ, BioNJ, UPGMA, Bayesian MCMC and Unweighted Maximum Parsimony methods. Neighboring binary nodes were joined into ternary nodes if the temporal distance between them was ≤ 300 years. StarlingNJ dates are proposed (S2 File).
Fig 2. Genetic structure of the Balto-Slavic populations within a European context according to the three genetic systems.
a) PC1 vs PC3 plot based on autosomal SNPs (PC1 = 0.53; PC3 = 0.26); b) MDS based on NRY data (stress = 0.13); c) MDS based on mtDNA data (stress = 0.20). We focus on PC1 vs PC3 because PC2 (S1 Fig), whilst differentiating the Volga region populations from the rest of Europeans, had a low efficiency in detecting differences among the Balto-Slavic populations, the primary focus of this work.
Fig 3. ADMIXTURE plot (k = 6).
Ancestry proportions of 1,194 individuals as revealed by ADMIXTURE.
Fig 4. Distribution of the average number of IBD segments between groups of East-West Slavs (a), South Slavs (b), and their respective geographic neighbors.
The x-axis indicates ten classes of IBD segment length (in cM); the y-axis indicates the average number of shared IBD segments per pair of individuals within each length class.
Fig 5. Correlations between matrices of genetic, geographic and linguistic distances among Balto-Slavic populations.
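One common way to correlate two distance matrices of the kind named in Fig 5 is a Mantel-style permutation test. The text here does not say which procedure the authors used, so the sketch below is an assumption rather than their method. The function names (`upper_triangle`, `pearson`, `mantel`) and the four-population toy matrices are hypothetical and are not data from the study.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the entries above the diagonal, optionally after relabeling rows/columns."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (correlation, one-sided permutation p-value) for two symmetric distance matrices."""
    if len(dist_a) != len(dist_b):
        raise ValueError("distance matrices must have the same size")
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    rng = random.Random(seed)
    order = list(range(len(dist_a)))
    hits = 0
    for _ in range(permutations):
        # Shuffle population labels of one matrix, keeping its rows and columns in sync.
        rng.shuffle(order)
        if pearson(upper_triangle(dist_a, order), flat_b) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four populations placed along a line.
positions = [0.0, 1.0, 3.0, 6.0]
geographic = [[abs(p - q) for q in positions] for p in positions]
# Toy "genetic" distances that scale exactly with geography.
genetic = [[0.01 * d for d in row] for row in geographic]

r, p_value = mantel(geographic, genetic, permutations=99)
print(f"r = {r:.3f}, p = {p_value:.3f}")
```

Permuting population labels, rather than individual matrix cells, keeps the dependence among the entries of each distance matrix intact, which is why the p-value comes from permutations instead of a standard correlation test.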



    1. Fortson Benjamin W. IV. Indo-European Language and Culture: An Introduction. Oxford: Blackwell; 2004.
    2. Mallory JP, Adams DQ. The Oxford introduction to Proto-Indo-European and the Proto-Indo-European world. Oxford: Oxford University Press; 2006.
    3. Rexová K, Frynta D, Zrzavý J. Cladistic analysis of languages: Indo-European classification based on lexicostatistical data. Cladistics. 2003;19: 120–127. doi:10.1111/j.1096-0031.2003.tb00299.x
    4. Ringe D, Warnow T, Taylor A. Indo-European and Computational Cladistics. Trans Philol Soc. 2002;100: 59–129. doi:10.1111/1467-968X.00091
    5. Nakhleh L, Warnow T, Ringe D, Evans SN. A comparison of phylogenetic reconstruction methods on an Indo-European dataset. Trans Philol Soc. 2005;103: 171–192. doi:10.1111/j.1467-968X.2005.00149.x
