Am J Hum Genet, 74 (6), 1111-20

Genetic Signatures of Strong Recent Positive Selection at the Lactase Gene


Todd Bersaglieri et al. Am J Hum Genet.


In most human populations, the ability to digest lactose contained in milk usually disappears in childhood, but in European-derived populations, lactase activity frequently persists into adulthood (Scrimshaw and Murray 1988). It has been suggested (Cavalli-Sforza 1973; Hollox et al. 2001; Enattah et al. 2002; Poulter et al. 2003) that a selective advantage based on additional nutrition from dairy explains these genetically determined population differences (Simoons 1970; Kretchmer 1971; Scrimshaw and Murray 1988; Enattah et al. 2002), but formal population-genetics-based evidence of selection has not yet been provided. To assess the population-genetics evidence for selection, we typed 101 single-nucleotide polymorphisms covering 3.2 Mb around the lactase gene. In northern European-derived populations, two alleles that are tightly associated with lactase persistence (Enattah et al. 2002) uniquely mark a common (~77%) haplotype that extends largely undisrupted for >1 Mb. We provide two new lines of genetic evidence that this long, common haplotype arose rapidly due to recent selection: (1) by use of the traditional FST measure and a novel test based on pexcess, we demonstrate large frequency differences among populations for the persistence-associated markers and for flanking markers throughout the haplotype, and (2) we show that the haplotype is unusually long, given its high frequency, a hallmark of recent selection. We estimate that strong selection occurred within the past 5,000-10,000 years, consistent with an advantage to lactase persistence in the setting of dairy farming; the signals of selection we observe are among the strongest yet seen for any gene in the genome.


Figure 1
Elevation in (a) FST and (b) pexcess at multiple SNPs in a 3.2-Mb region around the LCT gene. Position in kb relative to the start of transcription of LCT is on the X-axis. The 90th, 99th, and 99.9th percentiles for FST and pexcess are indicated by dashed lines and are based on 28,440 and 13,696 markers, respectively, throughout the genome (see the “Subjects and Methods” section).
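As a rough illustration of the per-SNP statistic plotted in panel (a), the sketch below computes Wright's FST for one biallelic SNP from allele frequencies in two populations. It assumes equal population weights and no sample-size correction; the abstract does not say which FST estimator the authors used, so this is the textbook definition rather than their exact method. The function name and the example frequencies are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2 are the frequencies of the same allele in each population.
    Assumption: textbook definition (H_T - H_S) / H_T with no sample-size
    correction; not necessarily the estimator used in the paper.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if the two populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only to show the call.
    print(fst_two_populations(0.8, 0.1))
```

Identical frequencies in the two populations give 0, and an allele fixed in one population and absent from the other gives 1, so large values at the persistence-associated and flanking markers reflect large frequency differences among populations.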
Figure 2
Long-range extended homozygosity for the core haplotype containing the persistence-associated alleles at LCT at various distances from LCT. The extent to which the common core haplotypes remain intact is shown for each chromosome in cM. The core region containing −13910C/T is shown as a black bar, and the LCT gene is oriented from left to right. Core haplotypes containing the persistence-associated allele (−13910T) are shown in red, and those containing the non-persistence–associated allele (−13910C) are shown in blue. Haplotypes are from European-derived U.S. pedigrees; all chromosomes with core haplotypes having a frequency ⩾5% in this population are depicted.
Figure 3
REHH, a measure of extended haplotype homozygosity, plotted for the persistence-associated haplotype at LCT, in comparison with REHH from haplotypes in 10,000 sets of simulated data (Sabeti et al. 2002). Data are shown using markers (a) 5′ and (b) 3′ to the core region. Data for the LCT-persistence-associated haplotype are indicated by red symbols, and data from simulations are indicated by gray symbols. REHH distributions from actual genotypes for 12 control regions were consistent with the simulated distributions (data not shown).
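The extended haplotype homozygosity (EHH) that underlies REHH can be sketched as follows: among chromosomes carrying a given core haplotype, it is the probability that two chromosomes chosen at random are identical at every marker from the core out to a given distance. The sketch assumes phased haplotypes written as strings, with markers ordered outward from the core; the function name and the toy data are hypothetical. REHH, which compares this value with the EHH of the other core haplotypes at the same locus, is not implemented here.

```python
from collections import Counter
from math import comb
from typing import Sequence


def ehh(extended: Sequence[str], n_markers: int) -> float:
    """EHH over the first n_markers flanking markers.

    `extended` holds one string per chromosome that carries the core
    haplotype; character i is the allele at the i-th marker moving
    away from the core (an assumed input layout).
    """
    n = len(extended)
    if n < 2:
        return 0.0  # homozygosity is undefined for fewer than two chromosomes
    # Group chromosomes that are identical across the first n_markers markers.
    groups = Counter(h[:n_markers] for h in extended)
    # Identical pairs divided by all possible pairs.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


if __name__ == "__main__":
    # Toy haplotypes, for illustration only.
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]
    for k in range(5):
        print(k, ehh(toy, k))
```

EHH starts at 1 at the core and decays with distance as recombination and mutation break up the haplotype; a haplotype that is both common and slow to decay is the pattern the figure highlights.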
