Nat Genet. 2019 Apr;51(4):683-693.
doi: 10.1038/s41588-019-0362-6. Epub 2019 Mar 11.

Interrogation of Human Hematopoiesis at Single-Cell and Single-Variant Resolution

Free PMC article

Jacob C Ulirsch et al. Nat Genet.


Widespread linkage disequilibrium and incomplete annotation of cell-to-cell state variation represent substantial challenges to elucidating mechanisms of trait-associated genetic variation. Here we perform genetic fine-mapping for blood cell traits in the UK Biobank to identify putative causal variants. These variants are enriched in genes encoding proteins in trait-relevant biological pathways and in accessible chromatin of hematopoietic progenitors. For regulatory variants, we explore patterns of developmental enhancer activity, predict molecular mechanisms, and identify likely target genes. In several instances, we localize multiple independent variants to the same regulatory element or gene. We further observe that variants with pleiotropic effects preferentially act in common progenitor populations to direct the production of distinct lineages. Finally, we leverage fine-mapped variants in conjunction with continuous epigenomic annotations to identify trait-cell type enrichments within closely related populations and in single cells. Our study provides a comprehensive framework for single-variant and single-cell analyses of genetic associations.

Conflict of interest statement

Competing interests

The authors declare no competing interests.


Figure 1 | Overview of hematopoiesis, UKB GWAS, and fine-mapping.
(a) Schematic of the human hematopoietic hierarchy showing the primary cell types analyzed in this work. Colors used in this schematic are consistent throughout all figures. Mono, monocyte; gran, granulocyte; ery, erythroid; mega, megakaryocyte; CD4, CD4+ T cell; CD8, CD8+ T cell; B, B cell; NK, natural killer cell; mDC, myeloid dendritic cell; pDC, plasmacytoid dendritic cell. The 16 blood traits that were genetically fine-mapped are shown below the hierarchy. (b) Schematic of UKB GWAS and fine-mapping approach. Briefly, blood traits from ~115K individuals were fine-mapped allowing for multiple causal variants and using imputed genotype dosages as reference LD. (c) Number of fine-mapped regions for each trait with the highest posterior probability for a variant being causal indicated. (d) Breakdown of the number of causal variants (min = 1, max = 5) for all regions in each trait. (e) Empirical distribution of the minor allele frequency of variants in each posterior bin. (f) Proportion of fine-mapped variants within intronic, promoter, coding, UTR, and intergenic regions. (g) Local-shifting enrichments of fine-mapped variants across all traits for varying posterior probability bins.
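The posterior probabilities (PP) referred to in panels c to g can be illustrated with a deliberately simplified sketch. It assumes a single causal variant per region and uses Wakefield-style approximate Bayes factors, whereas the study itself allowed multiple causal variants per region, so this is not the authors' procedure. The function names, the prior standard deviation of 0.15, and the example z-scores are all hypothetical choices made for illustration.

```python
import math


def log_approx_bayes_factor(z, se, prior_sd=0.15):
    """Log of a Wakefield-style approximate Bayes factor for one variant.

    z        : association z-score (effect estimate / standard error)
    se       : standard error of the effect estimate
    prior_sd : assumed prior standard deviation of the true effect (illustrative)
    """
    v = se ** 2           # sampling variance of the estimate
    w = prior_sd ** 2     # prior variance of the effect
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def posterior_probabilities(z_scores, std_errors, prior_sd=0.15):
    """Posterior probability that each variant is the one causal variant.

    Assumes exactly one causal variant in the region and equal prior odds.
    """
    log_bfs = [log_approx_bayes_factor(z, se, prior_sd)
               for z, se in zip(z_scores, std_errors)]
    top = max(log_bfs)                      # subtract the max for numerical stability
    scaled = [math.exp(lb - top) for lb in log_bfs]
    total = sum(scaled)
    return [s / total for s in scaled]


def credible_set(pps, coverage=0.95):
    """Smallest set of variant indices whose summed PP reaches the coverage."""
    order = sorted(range(len(pps)), key=lambda i: pps[i], reverse=True)
    chosen, running = [], 0.0
    for i in order:
        chosen.append(i)
        running += pps[i]
        if running >= coverage:
            break
    return chosen


# Hypothetical z-scores for four variants in one region.
pps = posterior_probabilities([8.0, 7.5, 2.0, 0.5], [0.01] * 4)
print([round(p, 4) for p in pps], credible_set(pps))
```

Binning variants by PP, as in panels e and g, then amounts to thresholding the list returned by `posterior_probabilities`.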
Figure 2 | Mechanisms of core gene regulation in blood production.
(a,b) Heatmaps depicting red blood cell trait-associated variants (PP > 0.10) across the erythroid lineage (a) and lymphocyte count-associated variants (PP > 0.10) across the lymphoid lineage (b), clustered by chromatin accessibility. Each row marks a fine-mapped variant, each column denotes a cell type within the relevant lineage, and color denotes relative chromatin accessibility along the lineage at each variant (blue, least accessible chromatin; red, most accessible chromatin). Putative target genes (predicted by ATAC-RNA correlation and/or PCHi-C) and disrupted TFs (predicted by ChIP-seq occupancy and motif disruption) are indicated to the right. (c) Transcription factor motifs disrupted in lineage-specific hematopoietic traits. Each row represents a set of traits where variants disrupt specified TF motifs and are occupied by that TF in hematopoietic cells. The unique margin sums across each lineage are shown in the bar plot for each TF. The expected number of variants with ChIP + motif disruption across all PPs is estimated using 100,000 permutations and is shown as a single point. (d) Examples of molecular mechanisms from the analysis in c reveal putative causal variants that disrupt cis-binding of hematopoietic TFs known to be involved in regulating hematopoiesis for various blood cell traits: rs10758656 and rs66480687 are associated with red blood cell traits; rs75522380 and rs74340846 are associated with platelet traits; rs4970966 is associated with monocyte count; and rs79716587 is associated with lymphocyte count. Black color represents accessibility throughout hematopoiesis, whereas other stacked colors represent accessibility for the cell types shown in Figure 3d. (e) Examples of putative target genes from the analysis in a and b: rs11642657 and rs12151289 are associated with monocyte count; rs73660574 is associated with red blood cell traits; rs553535973 is associated with lymphocyte count; and rs114694170 is associated with platelet traits. Colors for accessible chromatin are the same as in d.
Figure 3 | Characterization and validation of CCND3 and AK3 regions with multiple causal variants.
(a,b) Regional association plots (n = 116,667 individuals; BOLT-LMM P-values) for RBC count in the CCND3 locus from the initial GWAS (a) and after conditioning on the sentinel variant rs9349205 (b). (c,d) Fine-mapping identifies two putative causal variants (rs9349205, PP = 0.94; rs112233623, PP = 0.99) located 161 bp apart (c), both of which lie within the same erythroid-specific accessible chromatin (AC) (d). (e) Luciferase reporter assays for four haplotypes (left) corroborate independent additive effects of rs9349205 (red; P = 1.78 × 10⁻³) and rs112233623 (blue; P = 2.86 × 10⁻⁶) on RBC count (right). (f,g) Regional association plots (n = 116,666 individuals, BOLT-LMM P-values) for platelet count in the AK3 locus from the initial GWAS (f) and after conditioning on sentinel variant rs12005199 (g). (h,i) Fine-mapping identifies two putative causal variants (rs12005199, PP = 0.99; rs409950, PP = 0.99) 123 bp apart (h), both located within a strong megakaryocyte AC region (i). (j) Luciferase reporter assays (n = 9 biological replicates) for four haplotypes (left) corroborate independent additive effects of rs12005199 (red; two-sided Wald test P = 5.19 × 10⁻⁴) and rs409950 (blue; two-sided Wald test P = 3.57 × 10⁻⁵) on platelet count (right). Mean and standard error are indicated for both phenotype and regulatory activity.
Figure 4 | Dissecting mechanisms of pleiotropic variants across multiple blood cell lineages.
(a) Schematic that illustrates fine-mapped variants acting in multi-potential or heterogeneous progenitors on distinct hematopoietic lineages, either by tuning lineages in the same direction or switching the regulation in opposite directions. (b) A heatmap depicting 172 fine-mapped variants (PP > 0.10) with pleiotropic effects on cell counts in two or more hematopoietic lineages (eosinophil, neutrophil, basophil, lymphocyte, monocyte, platelet, RBC). Effects on eosinophil, neutrophil, and basophil counts are visualized together as a singular granulocyte lineage. Genomic annotations are indicated below each variant. (c) Pleiotropic variant rs78744187, located downstream of CEBPA, has high chromatin accessibility in CMP and MEP progenitors (top) and demonstrates a switch mechanism by downregulating basophil count while upregulating RBC count (bottom). (d) rs218265, located upstream of stem cell factor KIT, has high chromatin accessibility in several early progenitors (HSC, MPP, CMP, MEP) and demonstrates a switch mechanism by upregulating neutrophil and WBC count while downregulating RBC count. (e) rs17758695, located within an intron of anti-apoptotic factor BCL2, has high chromatin accessibility in several early progenitors (HSC, MPP, CMP, MEP) and exhibits a tuning mechanism, simultaneously downregulating eosinophil, monocyte, and RBC counts.
Figure 5 | Overview of g-chromVAR and application to hematopoietic cell types.
(a) Schematic showing inputs for continuous epigenomic data for each cell type and a matrix of fine-mapped variant posterior probabilities for GWAS traits. (b-d) Results from the application of g-chromVAR and three similar methods to 16 blood cell traits for 18 hematopoietic cell types. (b) Quantile-quantile representation of the P-values from each method. (c) Overlap between methods for Bonferroni-corrected trait enrichments. (d) Lineage enrichment of all trait-pairs (n = 288 pairs) for each method. A two-tailed Mann-Whitney rank-sum test was used to evaluate the relative enrichment of lineage-specific trait-cell type pairs (true positives). (e-h) Enrichments for four representative traits using g-chromVAR: mean corpuscular volume (e); mean platelet volume (f); monocyte count (g); lymphocyte count (h).
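To make the two inputs in panel a concrete, the following is a rough sketch of a PP-weighted accessibility enrichment, loosely in the spirit of the approach described but not the g-chromVAR implementation. It assumes a peaks × cell types count matrix and a per-peak weight equal to the summed PP of the fine-mapped variants falling in that peak. The null distribution here comes from shuffling the weights across all peaks, which is a simplification chosen for brevity; all function names and the toy numbers are hypothetical.

```python
import random
import statistics


def expected_counts(counts):
    """Expected counts per peak and cell type if every cell type had the
    same relative accessibility profile (peak total x cell-type depth / total)."""
    n_peaks, n_types = len(counts), len(counts[0])
    peak_totals = [sum(row) for row in counts]
    depths = [sum(counts[p][c] for p in range(n_peaks)) for c in range(n_types)]
    grand_total = sum(peak_totals)
    return [[peak_totals[p] * depths[c] / grand_total for c in range(n_types)]
            for p in range(n_peaks)]


def raw_deviation(counts, expected, weights, cell_type):
    """Relative excess of PP-weighted observed counts over expected counts."""
    observed = sum(w * row[cell_type] for w, row in zip(weights, counts))
    baseline = sum(w * row[cell_type] for w, row in zip(weights, expected))
    return (observed - baseline) / baseline


def enrichment_z(counts, weights, cell_type, n_background=200, seed=0):
    """Z-score of the observed deviation against shuffled-weight backgrounds.

    Shuffling weights over all peaks is an illustrative null model only.
    """
    rng = random.Random(seed)
    expected = expected_counts(counts)
    observed = raw_deviation(counts, expected, weights, cell_type)
    null = []
    for _ in range(n_background):
        shuffled = list(weights)
        rng.shuffle(shuffled)
        null.append(raw_deviation(counts, expected, shuffled, cell_type))
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Toy data: 8 peaks x 2 cell types; trait weights concentrate on peaks 0 and 1,
# which are most accessible in cell type 0.
toy_counts = [[50, 5], [40, 10], [10, 30], [5, 45],
              [8, 40], [12, 35], [9, 30], [11, 33]]
toy_weights = [0.9, 0.8, 0.0, 0.0, 0.01, 0.0, 0.0, 0.02]
print(enrichment_z(toy_counts, toy_weights, 0), enrichment_z(toy_counts, toy_weights, 1))
```

With 16 traits and 18 cell types there are 288 trait-cell type pairs, so a Bonferroni threshold at a nominal 0.05 level would be 0.05 / 288 per pair.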
Figure 6 | Application of g-chromVAR to single-cell chromatin accessibility data.
(a) 2,034 hematopoietic single cells projected onto a three-dimensional principal components embedding. Single cells colored by g-chromVAR enrichment scores for mean reticulocyte volume reveal specific regulatory enrichment in the MEP population. (b) Validation of g-chromVAR enrichments using synthetic bulk populations from sums of single cells (n = 2,034 cells). Aggregated single-cell g-chromVAR z-scores across all trait-cell type pairs (individual points) strongly correlate (Pearson r = 0.84) with bulk population z-scores. (c) Inferred pseudotime trajectories of three hematopoietic lineages from scATAC-seq data. (d) Pseudotime trends (mean and 95% CIs) of g-chromVAR scores for platelet count across all single cells (n = 2,034 cells) corroborate regulatory dynamics of megakaryocyte/erythroid differentiation. (e) Rank order plot highlighting the trait-cell type pairs with the greatest variance over that of a χ² distribution. (f) K-medoids partitioning of ATAC-seq counts in CMP cells (n = 502 cells) reveals two subpopulations: one that is enriched for monocyte genetic variants and one that is enriched for megakaryocyte/erythroid variants (RBC count, FDR = 1.28 × 10⁻⁴; MPV, FDR = 2.36 × 10⁻⁴; platelet count, FDR = 1.40 × 10⁻⁵; monocyte count, FDR = 2.21 × 10⁻²). ChromVAR scores for master transcription factors (TFs) of each blood cell type support biological hypotheses for genetic enrichments (GATA1, FDR = 1.76 × 10⁻⁸²; KLF1, FDR = 4.33 × 10⁻³; CEBPA, FDR = 2.58 × 10⁻¹⁶; IRF8, FDR = 4.65 × 10⁻¹⁵). Two-tailed t-tests were used for each comparison; boxplots represent median and interquartile range. (g) Similar k-medoids partitioning of MEP cells (n = 138 cells) reveals two subpopulations with differential enrichments for megakaryocyte or erythroid associated genetic variants (RBC count, FDR = 0.155; HCT, FDR = 3.98 × 10⁻²; platelet count, FDR = 7.65 × 10⁻²), along with consistent differences in chromVAR TF-deviation scores for master TFs of each blood cell type (GATA1, FDR = 2.18 × 10⁻⁴; KLF1, FDR = 4.02 × 10⁻⁶; MEF2C, FDR = 2.52 × 10⁻³).
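The k-medoids partitioning mentioned in panels f and g can be sketched generically as follows. This is a minimal alternating (assign, then update) k-medoids on toy feature vectors with Euclidean distance and a deterministic farthest-point start; the distance measure, initialization, and any preprocessing of the ATAC-seq counts used in the study are not stated in the caption, so every such choice here is an assumption, and the function names are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda j: euclidean(p, points[medoids[j]]))
            for p in points]


def k_medoids(points, k=2, max_iter=100):
    """Partition points into k clusters whose centers are actual data points."""
    # Deterministic start: point 0, then repeatedly the point farthest
    # from the medoids chosen so far (an illustrative initialization).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(euclidean(points[i], points[m]) for m in medoids)))

    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:            # keep the old medoid if a cluster empties
                updated.append(medoids[j])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(
                members,
                key=lambda i: sum(euclidean(points[i], points[m]) for m in members)))
        if updated == medoids:
            break
        medoids = updated

    return medoids, assign(points, medoids)


# Toy example: two well-separated groups of cells in a 2-D feature space.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoid_idx, cluster_labels = k_medoids(cells, k=2)
print(medoid_idx, cluster_labels)
```

Per-cluster enrichment scores could then be compared between the two groups, for example with a two-tailed t-test as the caption describes.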


