Genome Biol. 2023 Apr 13;24(1):73.
doi: 10.1186/s13059-023-02912-1.

Large-scale genome sequencing redefines the genetic footprints of high-altitude adaptation in Tibetans


Wangshan Zheng et al. Genome Biol. 2023.

Abstract

Background: Tibetans are genetically adapted to high-altitude environments. Although many studies have been conducted, the genetic basis of this adaptation remains elusive because selective signatures in Tibetan genomes have been detected with poor reproducibility.

Results: Here, we present whole-genome sequencing (WGS) data of 1001 indigenous Tibetans, covering the major populated areas of the Qinghai-Tibetan Plateau in China. We identify 35 million variants, and more than one-third of them are novel variants. Utilizing the large-scale WGS data, we construct a comprehensive map of allele frequency and linkage disequilibrium and provide a population-specific genome reference panel, referred to as 1KTGP. Moreover, with the use of a combined approach, we redefine the signatures of Darwinian-positive selection in the Tibetan genomes, and we characterize a high-confidence list of 4320 variants and 192 genes that have undergone selection in Tibetans. In particular, we discover four new genes, TMEM132C, ATP13A3, SANBR, and KHDRBS2, with strong signals of selection, and they may account for the adaptation of cardio-pulmonary functions in Tibetans. Functional annotation and enrichment analysis indicate that the 192 genes with selective signatures are likely involved in multiple organs and physiological systems, suggesting polygenic and pleiotropic effects.

Conclusions: Overall, the large-scale Tibetan WGS data and the identified adaptive variants/genes can serve as a valuable resource for future genetic and medical studies of high-altitude populations.

Keywords: 1KTGP; Cardio-pulmonary functions; High-altitude adaptation; Pleiotropic effect; Polygenic effect; Positive selection; Tibetan; Whole-genome sequencing.
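
The allele-frequency and linkage-disequilibrium (LD) map described in the Results rests on two basic quantities: minor allele frequency (MAF) and pairwise r². The following is a minimal Python sketch of both, assuming diploid genotypes coded as 0, 1, or 2 copies of the alternate allele. The function names and toy genotypes are hypothetical, and the genotype-correlation r² shown here is a common approximation, not necessarily the estimator the authors used.

```python
from typing import Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """MAF from diploid genotypes coded as 0/1/2 alternate-allele copies."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1.0 - alt_freq)


def genotype_r2(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Squared Pearson correlation between genotype dosages at two SNVs.

    This approximates haplotype-based r^2 when phase is unknown.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        raise ValueError("r^2 is undefined for a monomorphic site")
    return (cov * cov) / (var1 * var2)


# Toy genotypes for eight hypothetical individuals at two SNVs.
snv_a = [0, 1, 2, 1, 0, 2, 1, 0]
snv_b = [0, 1, 2, 1, 0, 2, 0, 0]

maf_a = minor_allele_frequency(snv_a)
r2_ab = genotype_r2(snv_a, snv_b)
print(f"MAF(snv_a) = {maf_a:.4f}, r^2(snv_a, snv_b) = {r2_ab:.4f}")
```

Computing r² for SNV pairs binned by physical distance yields an LD decay curve of the kind compared across populations in Fig. 2D.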


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Geographic locations of the sampled Tibetans and WGS data quality assessment. A The geographical locations of the Tibetan samples in this study. The sampling locations and the sample sizes are indicated. B The quality of the Tibetan 1001 WGS data, reflected by the depth and Q30 values. The mean depth and Q30 are indicated with the red dotted lines. C The minor allele frequency spectrum of all identified SNVs. The known and novel variants are shown in red and blue, respectively. D The genome-wide PCA plot of Tibetans and 18 representative East Asian populations. The red circles are the 1001 samples (Tibetans) from the current study, and the blue circles are the 33 published WGS samples (Tibetans*) [18]
Fig. 2
The spectrum of genome-wide variant frequency and LD of Tibetans. A Comparison of SNV counts by MAF between the 1001 WGS data and the published data. The 1001 WGS data is much more powerful in detecting rare variants than the published data. B The distribution of HWE deviation for SNVs with large between-population divergence (FST(Tibetan-Han) > 0.1); the cutoff of HWE deviation is 1e-6. C Validation by Sanger sequencing of three HWE-deviated SNVs with high FST(Tibetan-Han). The top panel shows the Sanger sequencing electropherograms of the three SNVs. The histograms in the middle indicate the minor allele frequencies (MAF) of the three SNVs from three datasets: the WGS data of 1001 Tibetans (in blue), 96 random samples from the 1001 WGS data (in green), and the Sanger sequencing data of 96 samples (in red). The p values under the histograms indicate the significance levels of HWE deviation of the three SNVs based on the three datasets. D Comparison of the LD decay patterns between Tibetans and other world populations. The dashed box indicates a distinctive LD decay pattern of Tibetans. Over long genomic distances (> 100 kb), Tibetans show a slower decay (reflected by the higher r² values) than other world populations, an indication of extended haplotype homozygosity. E The correlation of DAF (derived allele frequency) of the genome-wide SNVs from the 1001 Tibetan WGS data and the 3008 Tibetan array data [8] imputed by 1KTGP. F The correlation map when imputed by 1KGP3
Fig. 3
Genome-wide signals of Darwinian-positive selection in Tibetans. A The distribution of the CMS scores of the genome-wide SNVs in Tibetans. The 192 lead gene regions are marked in red (newly identified genes) and blue dots (reported genes). The top 10 TSNGs are indicated with gene names (4 newly identified and 6 reported). The Venn plot shows the overlap between the reported gene set and the identified gene set in this study. B Functional annotations of the 4320 TSNSs. The “Regulatory region” refers to the noncoding region with regulatory annotations. C The functional enrichment patterns of TSNGs using different methods. The significant terms are indicated in red in the bubble plots. NS, not significant
Fig. 4
Four newly identified TSNGs in the top 10 signals. A–D The regional plots of CMS scores and recombination rates, in which the peaks indicate the selective signals. The peak SNVs are marked with colors. The results of sliding window Fay and Wu’s H tests of the four genes are also presented. A The TMEM132C gene region. B The ATP13A3 gene region. C The SANBR gene region. D The KHDRBS2 gene region. The calculated recombination rates (r²) indicate the estimated linkage disequilibrium (LD) degree between the peak SNV and the other SNVs and are coded in colors. The significance threshold of CMS = 7.66 (top 1‰) is denoted by the red dashed line. The H values refer to the maximum scores of the given regions (marked in red), covering the upstream and downstream 500-kb regions of the peak SNVs of the four genes
Fig. 5
The polygenic and pleiotropic effects of the 192 TSNGs. The genes are assigned to different organs or physiological systems based on the current functional databases by using GeneORGANizer [51]. The top 10 TSNGs are highlighted by bold font in red (newly identified) and in blue (reported)
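
Two numerical criteria appear in the captions above: an HWE-deviation cutoff of 1e-6 (Fig. 2B) and a genome-wide top 1‰ score threshold (Fig. 4). The Python sketch below illustrates how such criteria can be computed, under stated assumptions: it uses a simple one-degree-of-freedom chi-square HWE test as a stand-in (the test actually used by the authors is not given here), and all function names, genotype counts, and scores are hypothetical.

```python
import math
from typing import Sequence


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Inputs are observed counts of the three genotypes at a biallelic site.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def top_permille_cutoff(scores: Sequence[float]) -> float:
    """Smallest score that still falls within the top 1 per mille of all scores."""
    ranked = sorted(scores, reverse=True)
    k = max(1, len(ranked) // 1000)
    return ranked[k - 1]


HWE_CUTOFF = 1e-6  # cutoff stated in the Fig. 2B caption

in_equilibrium = hwe_chi_square_p(25, 50, 25)   # counts match HWE exactly
no_heterozygotes = hwe_chi_square_p(50, 0, 50)  # extreme heterozygote deficit
print(in_equilibrium >= HWE_CUTOFF, no_heterozygotes < HWE_CUTOFF)

toy_scores = list(range(1, 10001))  # hypothetical per-SNV scores
print(top_permille_cutoff(toy_scores))
```

With small samples or rare alleles, an exact HWE test is generally preferred over the chi-square approximation; the approximation is used here only to keep the sketch short.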


References

    Beall CM, Song K, Elston RC, Goldstein MC. Higher offspring survival among Tibetan women with high oxygen saturation genotypes residing at 4,000 m. Proc Natl Acad Sci U S A. 2004;101:14300–14304. doi: 10.1073/pnas.0405949101.
    Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27:2987–2993. doi: 10.1093/bioinformatics/btr509.
    Liu S, Huang S, Chen F, Zhao L, Yuan Y, Francis SS, Fang L, Li Z, Lin L, Liu R, et al. Genomic analyses from non-invasive prenatal testing reveal genetic associations, patterns of viral infections, and Chinese population history. Cell. 2018;175:347–359.e14. doi: 10.1016/j.cell.2018.08.016.
    Chen F, Welker F, Shen CC, Bailey SE, Bergmann I, Davis S, Xia H, Wang H, Fischer R, Freidline SE, et al. A late middle Pleistocene Denisovan mandible from the Tibetan Plateau. Nature. 2019;569:409–412. doi: 10.1038/s41586-019-1139-x.
    Qi X, Cui C, Peng Y, Zhang X, Yang Z, Zhong H, Zhang H, Xiang K, Cao X, Wang Y, et al. Genetic evidence of paleolithic colonization and neolithic expansion of modern humans on the Tibetan Plateau. Mol Biol Evol. 2013;30:1761–1778. doi: 10.1093/molbev/mst093.
