Nature. 2008 Nov 6;456(7218):98-101.
doi: 10.1038/nature07331. Epub 2008 Aug 31.

Genes Mirror Geography Within Europe

Free PMC article

John Novembre et al. Nature. 2008.

Erratum in

  • Nature. 2008 Nov 13;456(7219):274


Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations. Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing; an individual's DNA can be used to infer their geographic origin with surprising accuracy, often to within a few hundred kilometres.
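
The "two-dimensional summary of genetic variation" can be illustrated with a principal component analysis of a genotype matrix. The following is a minimal sketch on simulated data, not the authors' pipeline: the function name `genotype_pca`, the toy allele-frequency gradient, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """Return PC scores for each individual.

    genotypes: (n_individuals, n_sites) array of allele counts (0, 1 or 2).
    """
    G = np.asarray(genotypes, dtype=float)
    # Centre each site so the PCs describe differences between individuals.
    G = G - G.mean(axis=0)
    # Scale each site to unit variance, dropping sites with no variation.
    sd = G.std(axis=0)
    G = G[:, sd > 0] / sd[sd > 0]
    # SVD of the standardized matrix; U * S gives the PC scores.
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Toy data (assumption): allele frequencies change linearly along a single
# hypothetical geographic coordinate x, with a different slope at each site.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)            # hypothetical position
slopes = rng.uniform(-0.4, 0.4, size=1000)     # per-site frequency gradient
freqs = 0.5 + np.outer(x - 0.5, slopes)        # stays within [0.3, 0.7]
genos = rng.binomial(2, freqs)                 # diploid genotypes

pcs = genotype_pca(genos)
# Correlation between PC1 and the simulated coordinate (sign is arbitrary).
r = abs(np.corrcoef(pcs[:, 0], x)[0, 1])
```

In this toy setting PC1 tracks the simulated coordinate even though each site differs only slightly between locations, which mirrors the abstract's point that weak differentiation spread over many sites can still recover geography.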


Figure 1. Population structure within Europe
a, A statistical summary of genetic data from 1,387 Europeans based on principal component axis one (PC1) and axis two (PC2). Small coloured labels represent individuals and large coloured points represent median PC1 and PC2 values for each country. The inset map provides a key to the labels. The PC axes are rotated to emphasize the similarity to the geographic map of Europe. AL, Albania; AT, Austria; BA, Bosnia-Herzegovina; BE, Belgium; BG, Bulgaria; CH, Switzerland; CY, Cyprus; CZ, Czech Republic; DE, Germany; DK, Denmark; ES, Spain; FI, Finland; FR, France; GB, United Kingdom; GR, Greece; HR, Croatia; HU, Hungary; IE, Ireland; IT, Italy; KS, Kosovo; LV, Latvia; MK, Macedonia; NO, Norway; NL, Netherlands; PL, Poland; PT, Portugal; RO, Romania; RS, Serbia and Montenegro; RU, Russia; Sct, Scotland; SE, Sweden; SI, Slovenia; SK, Slovakia; TR, Turkey; UA, Ukraine; YG, Yugoslavia. b, A magnification of the area around Switzerland from panel a, showing differentiation within Switzerland by language. c, Genetic similarity versus geographic distance. Median genetic correlation between pairs of individuals as a function of geographic distance between their respective populations.
Figure 2. Performance of assignment method
a, Predicted locations for each of 1,387 individuals based on leave-one-out cross-validation and the continuous assignment method. Small coloured labels (for definitions, see Fig. 1 legend, except here CH-I, CH-F, and CH-G denote Swiss individuals who speak Italian, French, or German, respectively) represent individual assignments. Coloured points denote the locations used to train the assignment method. b, Distribution of prediction accuracy by country. Distances are measured between the population assigned by the discrete assignment method and the geographic origin of the individual. The average is taken of the proportions across populations and each population is given equal weight. The panel shows results for populations with greater than six individuals; performance decreases for populations with smaller sample sizes (Supplementary Fig. 3).
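
The leave-one-out idea in panel a can be sketched as follows. This is an illustrative stand-in, not the paper's continuous or discrete assignment method: it swaps in a plain linear least-squares map from PC scores to latitude and longitude, and the names `haversine_km` and `loo_predict`, along with the simulated coordinates, are assumptions.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (mean Earth radius 6371 km)."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def loo_predict(pcs, coords):
    """Predict each individual's (lat, lon) from a model fitted without them.

    Stand-in model (assumption): linear regression of coordinates on PCs.
    """
    pcs = np.asarray(pcs, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(pcs)
    X = np.column_stack([np.ones(n), pcs])      # intercept + PC scores
    preds = np.empty_like(coords)
    for i in range(n):
        keep = np.arange(n) != i                # hold out individual i
        beta, *_ = np.linalg.lstsq(X[keep], coords[keep], rcond=None)
        preds[i] = X[i] @ beta
    return preds

# Toy data (assumption): PC scores are a rotated, noisy copy of geography.
rng = np.random.default_rng(1)
coords = np.column_stack([rng.uniform(40, 60, 100),   # latitude
                          rng.uniform(0, 20, 100)])   # longitude
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
pcs = (coords - coords.mean(axis=0)) @ rot + rng.normal(0, 0.5, (100, 2))

preds = loo_predict(pcs, coords)
errors_km = haversine_km(coords[:, 0], coords[:, 1], preds[:, 0], preds[:, 1])
median_error_km = float(np.median(errors_km))
```

Holding out each individual before fitting keeps the reported error from being flattered by the model having already seen that individual's true location.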
