Nat Commun, 10 (1), 551

Patterns of Genetic Differentiation and the Footprints of Historical Migrations in the Iberian Peninsula

Clare Bycroft et al. Nat Commun.

Abstract

The Iberian Peninsula is linguistically diverse and has a complex demographic history, including a centuries-long period of Muslim rule. Here, we study the fine-scale genetic structure of its population, and the genetic impacts of historical events, leveraging powerful, haplotype-based statistical methods to analyse 1413 individuals from across Spain. We detect extensive population structure at extremely fine scales (below 10 km) in some regions, including Galicia. We identify a major east-west axis of genetic differentiation, and evidence of historical north-to-south population movement. We find regionally varying fractions of north-west African ancestry (0–11%) in modern-day Iberians, related to an admixture event involving European-like and north-west African-like source populations. We date this event to 860–1120 CE, implying greater genetic impacts in the early half of Muslim rule in Iberia. Together, our results indicate clear genetic impacts of population movements associated with both the Muslim conquest and the subsequent Reconquista.

Conflict of interest statement

S.M. is a director of GENSCI limited. P.D. is a director and Chief Executive Officer of Genomics plc, and a partner of Peptide Groove LLP. The remaining authors declare no competing interests.

Figures

Fig. 1
Spanish individuals grouped into clusters using genetic data only. a Binary tree showing the hierarchical relationships between clusters inferred using genotype data of 1413 individuals (fineSTRUCTURE analysis A). The colours and points correspond to the clusters shown on the map, and the length of the coloured rectangles is proportional to the number of individuals assigned to that cluster. We combined some small clusters (Methods) and the thick black branches indicate the clades of the tree that we visualise in the map. Clusters are labelled according to the approximate location of most of their members, but geographic data were not used in the inference. b Each individual (n = 726) is represented by a point placed at (or close to, <24 km) the centroid of their grandparents’ birthplaces. We only plot the individuals for whom all four grandparents were born within 80 km of their average birthplace, although the data for all individuals were used in the fineSTRUCTURE inference. The background is coloured according to the spatial densities of each cluster at the level of the tree where there are 14 clusters (Methods). The colour and symbol of each point corresponds to the cluster the individual was assigned to at a lower level of the tree, as shown in a. Spain’s autonomous communities are also shown. c A representation of changes in the linguistic and political boundaries in Iberia from ~930 to 1300 CE, adapted with permission from maps by Baldinger. Different linguistic areas are shown with the colours and shading, and political boundaries with white borders (in the far right map only). Only the colours and labels of the Christian kingdoms have been added to aid visualisation.
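The plotting criterion described for panel b (all four grandparents born within 80 km of their average birthplace) can be illustrated with a minimal Python sketch. It assumes birthplaces are supplied as (latitude, longitude) pairs in degrees, approximates the "average birthplace" by the arithmetic mean of the coordinates, and measures distance with the haversine formula. These choices and all function names are hypothetical and are not taken from the paper's Methods.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def mean_birthplace(birthplaces):
    """Arithmetic mean of coordinates (assumed stand-in for the centroid)."""
    lats = [p[0] for p in birthplaces]
    lons = [p[1] for p in birthplaces]
    return (sum(lats) / len(lats), sum(lons) / len(lons))


def is_plottable(birthplaces, max_km=80.0):
    """True if all four grandparents were born within max_km of the mean birthplace."""
    if len(birthplaces) != 4:
        return False  # the caption's criterion needs all four grandparents
    centre = mean_birthplace(birthplaces)
    return all(haversine_km(centre, p) <= max_km for p in birthplaces)
```

Under these assumptions, an individual whose grandparents were all born around one town passes the filter, while one with a single grandparent born several hundred kilometres away does not.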
Fig. 2
Clustering analysis including Portuguese individuals; and large clusters at the bottom of the tree. a This map and tree show clusters inferred by fineSTRUCTURE (analysis B) that included data from Portuguese individuals but using a smaller set of SNPs (Methods). As in Fig. 1b we show the level of the tree such that all clusters contain at least 15 individuals (39 clusters). Points representing 843 individuals are shown on this map but, as with analysis A, data for all Portuguese and Spanish individuals (1530) were used in the inference. Positions of points and background colours are determined using the same procedure as for Fig. 1b (Methods), with the exception of Portugal. No fine-scale geographic information was available for these individuals, so we placed them randomly within the boundaries of Portugal and show a single background colour. b This map shows the geographic spread of the three large clusters that remain at the bottom of the tree inferred in the Spain-only fineSTRUCTURE analysis (see main text; Fig. 1a). These clusters each contain more than 100 individuals out of the full set of 1413. The accompanying tree highlights the three clusters within the full tree structure. The width of the coloured rectangles is proportional to the number of individuals belonging to each cluster (yellow = 222; orange = 165; red = 123).
Fig. 3
Ultra-fine-scale genetic structure within Spain. Points representing individuals are placed on each of the magnified maps and coloured as described in Fig. 1, with short dark lines pointing to their precise locations (the average birthplace of their grandparents). The three magnified maps show local elevation, rivers, and water bodies, as well as borders of autonomous communities (solid black lines) and provinces (dashed lines and text). a Locations of individuals (44) within the genetic clusters centred in Galicia. Note that we show this region at a higher level of the tree (14 clusters) as the lower level yields clusters with fewer than three individuals with fine-scale geographic location data. b Locations of individuals (60) within the clusters centred in the Basque-speaking regions of País Vasco (Basque Country) and Navarra. For visual clarity we only show the individuals that are within the clade coloured blue and green in Fig. 1. This clade makes up the majority of all individuals located in this region, and a majority of this clade is located in this region (60 of 64 with geographic data). c Locations of individuals (16), almost all of whom belong to a single cluster exclusive to a ~50-km-wide region along the banks of the River Ebro in La Rioja, just south of País Vasco and Navarra.
Fig. 4
Estimates of shared ancestry between Spanish individuals and across fineSTRUCTURE clusters. a Matrix of coancestry values used in cluster inference. Each of 1413 individuals is represented as a row, where each element is the coancestry (in cM) shared with each of the other individuals (see Methods for the definition of coancestry). In order to visualise the bulk of the variation, values equal to or above the 90th percentile (7.7 cM) are coloured black. The tree is as shown in Fig. 1a, and the horizontal black lines demarcate the clusters at the lower level of the tree, which are labelled with points. b The distribution of the mean coancestry between individuals in the same cluster for 200 bootstrap resamples (Methods). Clusters are ordered by their median value, and coloured/labelled according to those shown in a. One cluster (part of the clade labelled ‘Galicia_central’) was excluded from this analysis as it only contains 9 individuals. c Evidence for an excess of coancestry with a source cluster compared to within-cluster coancestry. Each row of this matrix is a cluster inferred in the fineSTRUCTURE analysis as labelled in a. For each recipient cluster (rows) we tested whether the mean coancestry among individuals within the recipient cluster is smaller than their mean coancestry with individuals in each of the other clusters (columns). Each element is coloured according to –log10(p), where p-values are based on 200 bootstrap resamples using the same sample size (13 individuals) for all clusters (Methods). Dark borders indicate source-recipient pairs with a p-value < 0.02 (not Bonferroni corrected). d Illustration of demographic scenarios leading to high coancestry between two different clusters. The symbols α and β represent clusters of individuals today, and α′ and β′ represent their ancestral populations. Arrows represent mixing of one ancestral population into the other at some time (or times) in the past. In the left two scenarios individuals in β will have, on average, higher coancestry with each other than with individuals in α. In the right two scenarios it is possible for individuals in β to have higher coancestry with individuals in α than with each other (see Supplementary Note 5 for a fuller discussion).
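The comparison behind panel c can be sketched in Python as a rough illustration. The caption gives only the number of resamples (200) and the common sample size (13); everything else here is an assumption: the coancestry matrix is a plain list of lists indexed as `coanc[recipient][donor]`, each resample draws 13 individuals per cluster without replacement, and the p-value is taken as the fraction of resamples in which within-cluster coancestry is not smaller than coancestry with the source cluster. The function name is hypothetical, and the authors' actual procedure is described in their Methods.

```python
import random
from statistics import mean


def excess_coancestry_pvalue(coanc, recipient, source,
                             n_resamples=200, sample_size=13, seed=0):
    """Fraction of resamples where within-recipient coancestry is NOT smaller
    than recipient-to-source coancestry (an assumed p-value definition)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    not_smaller = 0
    for _ in range(n_resamples):
        # Assumption: equal-size subsamples drawn without replacement.
        rec = rng.sample(recipient, sample_size)
        src = rng.sample(source, sample_size)
        # Mean coancestry among distinct recipient individuals.
        within = mean(coanc[i][j] for i in rec for j in rec if i != j)
        # Mean coancestry of recipient individuals with source individuals.
        between = mean(coanc[i][j] for i in rec for j in src)
        if within >= between:
            not_smaller += 1
    return not_smaller / n_resamples
```

A small value indicates that, in nearly every resample, individuals in the recipient cluster share more coancestry with the source cluster than with each other, which is the pattern highlighted by the dark borders in panel c.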
Fig. 5
Characterising genetic contributions to Iberia. a Geographic distribution of 843 Iberian individuals grouped into six clusters based on haplotype sharing with external populations (Methods). More individuals (1530) were used in the inference, but only those with adequate geographic data are shown on the map. Background colours and the positions of points on the map are determined using the same procedure as for Fig. 1b, with the exception of individuals of Portuguese origin. No fine-scale geographic information was available for these individuals, so we placed them randomly within the boundaries of Portugal and show a single background colour (Methods). b Admixture dates and mix of admixing groups in single-date, two-way admixture events, as inferred using GLOBETROTTER (n = 541 individuals). On the left are the donor groups inferred to best represent the two ancestral populations involved in the admixture event (separated by a dashed line), along with the inferred admixture proportions of the smaller side (for donor groups contributing at least 1%). Estimated dates and 95% bootstrap intervals are shown on the right, for each target Iberian cluster as shown in a. The white vertical dashed lines show the time of the initial Muslim conquest (711 CE) and the Siege of Seville (1248 CE), between which around half (or more) of Iberia was under Muslim rule. The admixture dates assume a 28-year generation time, and a current generation date of 1940 (the approximate average birth-year of this cohort). Detailed results of this GLOBETROTTER analysis are tabulated in Supplementary Tables 3a and 4. c, d We estimated ancestry profiles for each point on a fine spatial grid across Spain (Methods). The background colour shows the fraction contributed from a particular donor group, as defined by the scale bar. Grey crosses show the locations of the Iberian individuals used in the estimation: 843 in map c, 793 in map d. Map c shows the fraction contributed from the donor group ‘NorthMorocco’. Map d shows the fraction contributed from the donor group ‘Basque1’, which we defined based on the Spain-only fineSTRUCTURE analysis (Fig. 1a). Maps for other donor groups are shown in Supplementary Figure 5.
Fig. 6
Locations of donor groups and ancestry profiles of Iberian clusters. a Locations of individuals (n = 1503) within 30 non-Spanish genetic groups inferred using fineSTRUCTURE (Methods). Each point represents an individual, placed at their country-level location of origin, and coloured according to their inferred genetic group. Individuals from the same location (country) have been randomly jittered for visual clarity. Names are assigned to clusters based on where the majority of the individuals in the clusters are located. Where a cluster was split more evenly across two regions, a double-barrelled name is used. All groups shown here, except ‘Portugal’, were used as donor groups in the analyses of Iberia. b Each column shows the ancestry profile for each of the inferred clusters shown in Fig. 5a. The heights of the bars show the proportion of each cluster’s ancestry that is best represented by that of the labelled non-Iberian donor group (Methods). Note that each row has a different y-axis range for visibility of the smaller components. Error bars show the range of the inner 95% of 1000 bootstrap resamples (Methods), and donor groups are only shown if at least one cluster has a range not including zero and a point estimate >0.001. The exact values plotted here and cluster sample sizes are in Supplementary Table 1.
Fig. 7
Variation and timing of Basque-like genetic contributions in Iberia. Fractional contributions from the Basque-like donor group in ancestry profiles (Methods) and Basque-like admixture dates (GLOBETROTTER) for each cluster inferred in the Spain-only analysis (as shown in Fig. 1a) plus Portugal. The clade labelled ‘Galicia_Pontevedra’ in Fig. 1a was combined into one group for this analysis. The admixture dates are for a two-way admixture event involving a Basque-like side and a European-like side, and are shown with 95% bootstrap intervals (Methods). The dates shown assume a 28-year generation time, and a current generation date of 1940. Detailed results of this GLOBETROTTER analysis are in Supplementary Table 3b.
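The captions for Figs. 5b and 7 state that admixture dates assume a 28-year generation time and a current generation date of 1940. A minimal Python sketch of that conversion follows. It assumes the simplest convention, calendar year = 1940 − 28 × generations; the captions do not say whether an extra generation is added, so this exact formula is an assumption, and the function names are hypothetical.

```python
def generations_to_year(generations, generation_time=28.0, reference_year=1940.0):
    """Convert generations before the reference year into a calendar year (CE).

    Assumed convention: year = reference_year - generations * generation_time.
    """
    return reference_year - generations * generation_time


def year_to_generations(year, generation_time=28.0, reference_year=1940.0):
    """Inverse conversion: calendar year (CE) to generations before the reference year."""
    return (reference_year - year) / generation_time
```

Under this convention, an event 30 generations ago falls in 1100 CE, and the 860–1120 CE interval given in the abstract corresponds to roughly 29 to 39 generations before 1940.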
