Local Ancestry Inference in a Large US-Based Hispanic/Latino Study: Hispanic Community Health Study/Study of Latinos (HCHS/SOL)

G3 (Bethesda). 2016 Jun 1;6(6):1525-34. doi: 10.1534/g3.116.028779.

Abstract

We estimated local ancestry on the autosomes and X chromosome in a large US-based study of 12,793 Hispanic/Latino individuals using the RFMix method, and we compared different reference panels and approaches to local ancestry estimation on the X chromosome by means of Mendelian inconsistency rates as a proxy for accuracy. We developed a novel and straightforward approach to performing ancestry-specific PCA after finding artifactual behavior in the results from an existing approach. Using the ancestry-specific PCA, we found significant population structure within African, European, and Amerindian ancestries in the Hispanic/Latino individuals in our study. In the African ancestral component of the admixed individuals, individuals whose grandparents were from Central America clustered separately from individuals whose grandparents were from the Caribbean, and also from reference Yoruba and Mandenka West African individuals. In the European component, individuals whose grandparents were from Puerto Rico diverged partially from other background groups. In the Amerindian ancestral component, individuals clustered into multiple different groups depending on the grandparental country of origin. Therefore, local ancestry estimation provides further insight into the complex genetic structure of US Hispanic/Latino populations, which must be properly accounted for in genotype-phenotype association studies. It also provides a basis for admixture mapping and ancestry-specific allele frequency estimation, which are useful in the identification of risk factors for disease.
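The abstract does not spell out how Mendelian inconsistency was computed for local ancestry calls, so the following is only a minimal sketch of one plausible reading. It assumes an autosomal trio in which each individual has an unordered pair of ancestry calls per locus. The function names and the ancestry labels ("AFR", "EUR", "AMR") are hypothetical and are not taken from the study or from RFMix. X-chromosome calls would need sex-aware handling (males carry a single X), which this sketch does not attempt.

```python
from itertools import product


def trio_consistent(child, mother, father):
    """Return True if the child's unordered ancestry pair can be formed by
    taking one ancestry call from the mother and one from the father."""
    target = sorted(child)
    return any(sorted((m, f)) == target for m, f in product(mother, father))


def mendelian_inconsistency_rate(child_calls, mother_calls, father_calls):
    """Fraction of loci at which the trio's local ancestry calls are
    incompatible with Mendelian transmission.

    Each argument is a list with one (ancestry, ancestry) pair per locus.
    """
    if not (len(child_calls) == len(mother_calls) == len(father_calls)):
        raise ValueError("all three call lists must cover the same loci")
    if not child_calls:
        raise ValueError("at least one locus is required")
    inconsistent = sum(
        not trio_consistent(c, m, f)
        for c, m, f in zip(child_calls, mother_calls, father_calls)
    )
    return inconsistent / len(child_calls)


# Illustrative (made-up) calls at four loci for one trio.
child = [("AFR", "EUR"), ("AMR", "AMR"), ("EUR", "EUR"), ("AFR", "AMR")]
mother = [("AFR", "AFR"), ("AMR", "EUR"), ("EUR", "EUR"), ("EUR", "EUR")]
father = [("EUR", "EUR"), ("AMR", "AMR"), ("AFR", "AFR"), ("AFR", "AMR")]

rate = mendelian_inconsistency_rate(child, mother, father)
print(rate)
```

Under this reading, a lower rate across many trios would suggest more accurate calls for a given reference panel or X-chromosome approach, although the rate is only a proxy: some errors are consistent with Mendelian transmission and go undetected.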

Keywords: Hispanic/Latino; local ancestry; principal components analysis.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Alleles
  • Female
  • Gene Frequency
  • Genetic Association Studies
  • Genetics, Population*
  • Genome-Wide Association Study
  • Genotype
  • Hispanic Americans / genetics*
  • Humans
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide
  • Public Health Surveillance*
  • United States / epidemiology
  • United States / ethnology
  • Young Adult