Clustering of 770,000 genomes reveals post-colonial population structure of North America

Nat Commun. 2017 Feb 7;8:14238. doi: 10.1038/ncomms14238.

Abstract

Despite strides in characterizing human history from genetic polymorphism data, progress in identifying genetic signatures of recent demography has been limited. Here we identify very recent fine-scale population structure in North America from a network of over 500 million genetic (identity-by-descent, IBD) connections among 770,000 genotyped individuals of US origin. We detect densely connected clusters within the network and annotate these clusters using a database of over 20 million genealogical records. Recent population patterns captured by IBD clustering include immigrants such as Scandinavians and French Canadians; groups with continental admixture such as Puerto Ricans; settlers such as the Amish and Appalachians who experienced geographic or cultural isolation; and broad historical trends, including reduced north-south gene flow. Our results yield a detailed historical portrait of North America after European settlement and support substantial genetic heterogeneity in the United States beyond that uncovered by previous studies.
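As an illustration of what "densely connected clusters within the network" can mean in practice, the following minimal sketch groups individuals in a toy IBD network. The abstract does not state which clustering algorithm the authors used. Weighted label propagation is swapped in here only because it is short and needs no external libraries. The function name `detect_clusters`, the individual IDs, and the shared-centimorgan edge weights are all hypothetical.

```python
from collections import defaultdict


def detect_clusters(edges, max_iters=100):
    """Group individuals by weighted label propagation over an IBD network.

    edges: iterable of (person_a, person_b, shared_cm) tuples, where
    shared_cm is a hypothetical IBD weight (total centimorgans shared).
    Illustrative stand-in only; not the method used in the paper.
    """
    # Build an undirected, weighted adjacency map.
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = w
        adj[b][a] = w

    # Every individual starts in a cluster of their own.
    labels = {node: node for node in adj}

    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):  # fixed order keeps the result deterministic
            # Sum the IBD weight contributed by each neighbouring label.
            totals = defaultdict(float)
            for nbr, w in adj[node].items():
                totals[labels[nbr]] += w
            best = max(totals.values())
            candidates = sorted(l for l, t in totals.items() if t == best)
            # Keep the current label on a tie, else take the smallest label.
            new = labels[node] if labels[node] in candidates else candidates[0]
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break

    clusters = defaultdict(list)
    for node, label in labels.items():
        clusters[label].append(node)
    return sorted(sorted(members) for members in clusters.values())


# Toy network: two tightly connected groups joined by one weak IBD link.
toy_edges = [
    ("a1", "a2", 30.0), ("a1", "a3", 30.0), ("a2", "a3", 30.0),
    ("b1", "b2", 25.0), ("b1", "b3", 25.0), ("b2", "b3", 25.0),
    ("a3", "b1", 5.0),
]

if __name__ == "__main__":
    for members in detect_clusters(toy_edges):
        print(members)
```

In this toy case the weak link between the two groups is outweighed by the stronger within-group connections, so each group ends up with its own label. A network of hundreds of millions of connections would call for a scalable graph library and a more robust community-detection method than this sketch.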

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Demography / methods
  • Demography / statistics & numerical data*
  • Emigrants and Immigrants
  • Gene Flow / genetics
  • Genetics, Population / methods*
  • Genotyping Techniques
  • Haplotypes / genetics
  • Humans
  • Polymorphism, Single Nucleotide
  • Population / genetics*
  • Population Dynamics / statistics & numerical data
  • Population Dynamics / trends*
  • Sequence Analysis, DNA
  • United States / ethnology