Am J Hum Genet. 2016 Feb 4;98(2):299-309.
doi: 10.1016/j.ajhg.2015.12.023.

A Burden of Rare Variants Associated with Extremes of Gene Expression in Human Peripheral Blood

Jing Zhao et al. Am J Hum Genet. 2016.

Abstract

In order to evaluate whether rare regulatory variants in the vicinity of promoters are likely to impact gene expression, we conducted a novel burden test for enrichment of rare variants at the extremes of expression. After sequencing 2-kb promoter regions of 472 genes in 410 healthy adults, we performed a quadratic regression of rare variant count on bins of peripheral blood transcript abundance from microarrays, summing over ranks of all genes. After adjusting for common eQTLs and the major axes of gene expression covariance, a highly significant excess of variants with minor allele frequency less than 0.05 at both high and low extremes across individuals was observed. Further enrichment was seen in sites annotated as potentially regulatory by RegulomeDB, but a deficit of effects was associated with known metabolic disease genes. The main result replicates in an independent sample of 75 individuals with RNA-seq and whole-genome sequence information. Three of four predicted large-effect sites were validated by CRISPR/Cas9 knockdown in K562 cells, but simulations indicate that effect sizes need not be unusually large to produce the observed burden. Unusually divergent low-frequency promoter haplotypes were observed at 31 loci, at least 9 of which appear to be derived from Neandertal admixture, but these were not associated with divergent gene expression in blood. The overall burden test results are consistent with rare and private regulatory variants driving high or low transcription at specific loci, potentially contributing to disease.


Figures

Figure 1
Schema Showing the Pooling Strategy to Evaluate Rare Variant Enrichment. For each gene, the normalized gene expression measures across all 410 individuals are sorted into 82 bins of 5 individuals, resulting in the approximately normal frequency distributions shown in the top panels. The number of rare variants in the 2-kb promoter of each allele in that bin is then tallied: for example, there are 2, 1, 0, 0, and 1 rare variants in the promoters of the 5 individuals (both alleles) in the second bin for gene 1, summing to 4, whereas the second bin for gene 2 has 3 rare variants. These per-bin rare allele counts are then summed over all 472 genes and plotted from the lowest to the highest expression bin, yielding the plots at the bottom of the figure, which represent two alternative results. In the absence of a burden of rare variants at the extremes, neither the linear slope nor the quadratic fit is significant (left plot), whereas an excess of variants at both extremes produces a concave-up “smile” regression (right plot). If there were an excess at only the low or only the high extreme of expression, the linear slope would be significant.
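The pooling scheme in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `burden_by_bin` and `quadratic_fit` are hypothetical, the inputs are assumed to be gene-by-individual arrays of normalized expression and promoter rare variant counts, and the sketch reports only the fitted coefficients and R², not the p-values given in the paper.

```python
import numpy as np

def burden_by_bin(expr, rare_counts, n_bins):
    """Sum rare variant counts per expression bin, pooled over genes.

    expr, rare_counts: arrays of shape (n_genes, n_individuals).
    Individuals are ranked by expression separately for each gene,
    split into equal-sized bins, and bin counts are summed over genes.
    """
    expr = np.asarray(expr, dtype=float)
    rare_counts = np.asarray(rare_counts)
    n_genes, n_ind = expr.shape
    if n_ind % n_bins != 0:
        raise ValueError("n_individuals must be divisible by n_bins")
    per_bin = n_ind // n_bins
    # Order individuals from lowest to highest expression within each gene.
    order = np.argsort(expr, axis=1, kind="stable")
    sorted_counts = np.take_along_axis(rare_counts, order, axis=1)
    # Tally variants within each bin, then sum the bins across genes.
    binned = sorted_counts.reshape(n_genes, n_bins, per_bin).sum(axis=2)
    return binned.sum(axis=0)

def quadratic_fit(bin_totals):
    """Fit totals ~ c0 + c1*x + c2*x^2 on centered bin ranks.

    Returns (c2, c1, r_squared). A positive c2 corresponds to the
    "smile" pattern; a nonzero c1 to a one-sided excess.
    """
    y = np.asarray(bin_totals, dtype=float)
    x = np.arange(y.size, dtype=float)
    x -= x.mean()
    c2, c1, c0 = np.polyfit(x, y, 2)
    fitted = np.polyval([c2, c1, c0], x)
    ss_res = ((y - fitted) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return c2, c1, 1.0 - ss_res / ss_tot
```

With 410 individuals and `n_bins=82`, each bin holds 5 individuals, matching the layout described in the caption.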
Figure 2
Relationship between Rare Variant Counts and Transcript Abundance. Each plot shows the cumulative number of rare variants in equal-sized bins across the indicated number of genes and individuals, with the lowest expression bin on the left and the highest on the right. Lines indicate the best-fit quadratic model. (A) 472 genes in 410 individuals of mixed ethnicity (5 individuals per bin), with gene expression data normalized by SNM with variance adjustment (whole model R² = 0.19, p = 0.0003). (B) 472 genes in 279 Europeans (93 bins of 3 individuals) after removing effects of common eQTLs and conserved axes of covariation (R² = 0.17, p = 0.0003). (C) 4,633 genes in the 75 individuals of the replication sample after removing effects of conserved axes of covariation and PC1 (R² = 0.49, p = 2 × 10⁻¹¹).
Figure 3
Genes with a Low Ratio of Promoter to Coding Polymorphism Are Intolerant to Large Effect Regulatory Variants. (A) Regression of π_5′P on π_cod, highlighting the bottom 10% of genes with the largest negative residuals (lowest ratio, red) and the top 10% (blue). (B) Plot of the relationship between the number of variants per gene in the top decile of estimated effect size from all 8,833 rare variants (MAF < 0.05) and the residual from the regression in (A) as a measure of relative promoter polymorphism.
Figure 4
Estimation of Effect Sizes of Rare Variants. (A) The distribution of absolute values of the estimated effect size from the CHDWB data as a function of the number of alleles in the sample. Boxes show mean and interquartile range, with whiskers at 1.5 times IQR and outliers as single points. (B) Comparison of the estimated effect size distribution from the data (bold curve) and in the simulation (thin black curve), showing a slight excess of larger effect variants in the observed data. (C) Simulated model fit assuming a gamma distribution with gamma(1.5,0.12), for 472 genes in 279 individuals, showing an excess of extreme expression as in the actual data (R² = 0.18, p = 0.0002).
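A simulation of the kind described in panel (C) might look like the sketch below. Several details are assumptions made for illustration, since the caption does not give them: gamma(1.5,0.12) is read as shape 1.5 and scale 0.12, each rare variant is given a random sign, effects are additive on a standard normal expression baseline, and the function name `simulate_expression` is hypothetical.

```python
import numpy as np

def simulate_expression(rare_counts, rng, shape=1.5, scale=0.12):
    """Simulate expression with gamma-distributed rare variant effects.

    rare_counts: integer array (n_genes, n_individuals) of rare
    variants per promoter. Returns (expression, shift), where shift is
    the summed signed effect of the rare variants in each cell.
    Assumed, not taken from the source: shape/scale parameterization,
    random effect sign, additivity, standard normal baseline.
    """
    rare_counts = np.asarray(rare_counts, dtype=int)
    baseline = rng.standard_normal(rare_counts.shape)
    flat_counts = rare_counts.ravel()
    # One entry per variant, holding the index of the cell it belongs to.
    cell_index = np.repeat(np.arange(flat_counts.size), flat_counts)
    effects = rng.gamma(shape, scale, size=cell_index.size)
    signs = rng.choice([-1.0, 1.0], size=cell_index.size)
    shift = np.zeros(flat_counts.size)
    # Accumulate signed effects per cell; add.at handles repeated indices.
    np.add.at(shift, cell_index, signs * effects)
    shift = shift.reshape(rare_counts.shape)
    return baseline + shift, shift
```

The simulated expression matrix could then be binned and fitted with a quadratic regression in the same way as the observed data, to check whether modest effect sizes are enough to produce an excess of variants at the extremes.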
