RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference

Am J Hum Genet. 2013 Aug 8;93(2):278-88. doi: 10.1016/j.ajhg.2013.06.020. Epub 2013 Aug 1.


Local-ancestry inference is an important step in the genetic analysis of fully sequenced human genomes. Current methods can accurately detect only continental-level ancestry (i.e., European versus African versus Asian), even when using millions of markers. Here, we present RFMix, a powerful discriminative modeling approach that is faster (~30×) and more accurate than existing methods. We accomplish this by using a conditional random field parameterized by random forests trained on reference panels. RFMix is capable of learning from the admixed samples themselves to boost performance and autocorrect phasing errors. RFMix shows high sensitivity and specificity in simulated Hispanics/Latinos, African Americans, and admixed Europeans, Africans, and Asians. Finally, we demonstrate that African Americans in HapMap contain modest (but nonzero) levels of Native American ancestry (~0.4%).
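
The idea of a chain model whose per-window scores come from a trained classifier can be illustrated with a small sketch. This is not RFMix's implementation: it assumes that per-window ancestry probabilities have already been produced by some classifier (random forests, in the approach described above), and it replaces the conditional random field with a simplified linear chain that uses a single fixed switch probability and Viterbi decoding. The function name `smooth_ancestry` and the parameter `switch_prob` are hypothetical and chosen only for illustration.

```python
import math


def smooth_ancestry(window_probs, switch_prob=0.05):
    """Return the most likely ancestry label per window (Viterbi decoding).

    window_probs: list of per-window probability lists, one entry per
        ancestry (assumed to come from a classifier trained on reference
        panels; at least two ancestries are required).
    switch_prob: assumed probability of an ancestry switch between
        adjacent windows (an illustrative constant, not an RFMix parameter).
    """
    n_anc = len(window_probs[0])
    stay = math.log(1.0 - switch_prob)
    switch = math.log(switch_prob / (n_anc - 1))
    eps = 1e-12  # guards against log(0)

    # Best log-score of any path ending in each ancestry at the first window.
    score = [math.log(p + eps) for p in window_probs[0]]
    back = []  # back[t][k] = best previous ancestry for ancestry k at window t+1

    for probs in window_probs[1:]:
        new_score, pointers = [], []
        for k in range(n_anc):
            best_prev = max(
                range(n_anc),
                key=lambda j: score[j] + (stay if j == k else switch),
            )
            trans = stay if best_prev == k else switch
            new_score.append(score[best_prev] + trans + math.log(probs[k] + eps))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final ancestry.
    state = max(range(n_anc), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy input: two ancestries, seven windows; window 3 is a noisy call.
example = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
           [0.1, 0.9], [0.2, 0.8], [0.1, 0.9]]
print(smooth_ancestry(example))  # [0, 0, 0, 0, 1, 1, 1]
```

Taking the most probable ancestry in each window independently would give `[0, 0, 1, 0, 1, 1, 1]` for this toy input. The chain model overrides the isolated call in the third window because two switches cost more than the weak evidence there is worth, while still accepting the sustained change in the last three windows.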

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • African Continental Ancestry Group / genetics*
  • Asian Continental Ancestry Group / genetics*
  • Computer Simulation
  • European Continental Ancestry Group / genetics*
  • Genetic Testing
  • Genome, Human*
  • Haplotypes
  • Humans
  • Indians, North American / genetics*
  • Models, Genetic*
  • Polymorphism, Single Nucleotide