Whole-Exome Sequencing Reveals Uncaptured Variation and Distinct Ancestry in the Southern African Population of Botswana

Am J Hum Genet. 2018 May 3;102(5):731-743. doi: 10.1016/j.ajhg.2018.03.010. Epub 2018 Apr 26.

Abstract

Large-scale, population-based genomic studies have provided a context for modern medical genetics. Among such studies, however, African populations have remained relatively underrepresented. The breadth of genetic diversity across the African continent argues for an exploration of local genomic context to facilitate burgeoning disease mapping studies in Africa. We sought to characterize genetic variation and to assess population substructure within a cohort of HIV-positive children from Botswana, a Southern African country that is regionally underrepresented in genomic databases. Using whole-exome sequencing data from 164 Batswana and comparisons with 150 similarly sequenced HIV-positive Ugandan children, we found that 13%-25% of variation observed among Batswana was not captured by public databases. Uncaptured variants were significantly enriched (p = 2.2 × 10^-16) for coding variants with minor allele frequencies between 1% and 5% and included predicted-damaging non-synonymous variants. Among variants found in public databases, corresponding allele frequencies varied widely, with Botswana having significantly higher allele frequencies among rare (<1%) pathogenic and damaging variants. Batswana clustered with other Southern African populations, but distinctly from 1000 Genomes African populations, and had limited evidence for admixture with extra-continental ancestries. We also observed a surprising lack of genetic substructure in Botswana, despite multiple tribal ethnicities and language groups, alongside a higher degree of relatedness than purported founder populations from the 1000 Genomes project. Our observations reveal a complex, but distinct, ancestral history and genomic architecture among Batswana and suggest that disease mapping within similar Southern African populations will require a deeper repository of genetic variation and allelic dependencies than presently exists.

Keywords: AIDS; Africa; HIV; genetic mapping; genomics; population genetics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • African Continental Ancestry Group / genetics*
  • Botswana
  • Cohort Studies
  • Gene Pool
  • Genetic Variation*
  • Genetics, Population
  • Genome, Human
  • Geography
  • Humans
  • Phylogeny
  • Principal Component Analysis
  • Whole Exome Sequencing*