Nat Genet. 2015 Feb;47(2):172-9.
doi: 10.1038/ng.3176. Epub 2015 Jan 5.

High-density Mapping of the MHC Identifies a Shared Role for HLA-DRB1*01:03 in Inflammatory Bowel Diseases and Heterozygous Advantage in Ulcerative Colitis

Free PMC article

Philippe Goyette et al. Nat Genet. 2015.


Genome-wide association studies of the related chronic inflammatory bowel diseases (IBD) known as Crohn's disease and ulcerative colitis have shown strong evidence of association to the major histocompatibility complex (MHC). This region encodes a large number of immunological candidates, including the antigen-presenting classical human leukocyte antigen (HLA) molecules. Studies in IBD have indicated that multiple independent associations exist at HLA and non-HLA genes, but they have lacked the statistical power to define the architecture of association and causal alleles. To address this, we performed high-density SNP typing of the MHC in >32,000 individuals with IBD, implicating multiple HLA alleles, with a primary role for HLA-DRB1*01:03 in both Crohn's disease and ulcerative colitis. Noteworthy differences were observed between these diseases, including a predominant role for class II HLA variants and heterozygous advantage observed in ulcerative colitis, suggesting an important role of the adaptive immune response in the colonic environment in the pathogenesis of IBD.

Conflict of interest statement

Competing Financial Interests

The authors declare no competing financial interests.


Figure 1. Primary univariate association analyses of CD and UC
Univariate association analysis results for 8,939 SNPs (dark grey) (Supplementary Table 12), 90 2-digit and 138 4-digit resolution HLA alleles (yellow) (Supplementary Table 13), and 741 single amino acid variants (red) (Supplementary Table 4) in the MHC region are shown for 18,405 CD cases and 14,308 UC cases (with 34,241 common control subjects). Because previous genetic analyses identified distinct effects in the MHC for CD and UC, with different non-correlated alleles in each disease, we performed the fine-mapping analyses for CD and UC separately. (a) The primary univariate association analysis in CD reveals over 1,789 markers showing study-wide significant association (P<5×10^-6) across the MHC, including 32 4-digit resolution classical HLA alleles (Fig. 3 and Supplementary Table 2). The single most significant variant for CD is HLA-DRB1*01:03 (P=3×10^-62, OR=2.51). (b) The primary univariate association analysis in UC reveals over 2,762 markers showing study-wide significant association across the MHC, including 50 4-digit resolution classical HLA alleles (Fig. 3 and Supplementary Table 3). The single most significant variant for UC is rs6927022 (P=8×10^-154, OR=1.49), while the best HLA allele is HLA-DRB1*01:03 (P=3×10^-119, OR=3.59); the two act independently. Twenty-nine SNPs and 9 amino acid variants are more significant than HLA-DRB1*01:03 in the primary analysis; however, all of these are correlated with rs6927022, and their significance is dramatically reduced by conditional logistic regression.
Figure 2. Variance explained by 4-digit HLA alleles in CD and UC
Proportion of variance explained on a logit scale (McKelvey and Zavoina's pseudo-R², see Online Methods) for different models in CD (left) and UC (right). The top boxes show the variance explained by previously identified GWAS index SNPs within the MHC. The middle boxes illustrate the variance explained by HLA models including all 4-digit alleles of frequency >0.5% (126 alleles in CD and UC) and by models restricted to 4-digit alleles within either the class I (63 alleles) or class II (63 alleles) regions. The Venn diagram illustrates the proportion of variance explained that is unique to class I, unique to class II, or shared. The bottom boxes indicate the variance explained by the proposed HLA models (15 and 16 alleles in CD and UC, respectively). Note that these estimates of variance explained were computed on the logit scale for practical reasons and should not be directly compared to heritability estimates computed on the (Gaussian) liability scale.
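As a rough illustration of the statistic named in this caption, McKelvey and Zavoina's pseudo-R² for a logistic model is the variance of the fitted linear predictor divided by that variance plus π²/3, the variance of the standard logistic error. The sketch below assumes the fitted log-odds are already available; the function name and the example values are hypothetical and are not taken from the paper, whose exact procedure is described in its Online Methods.

```python
import math

def mckelvey_zavoina_r2(linear_predictors):
    """Pseudo-R^2 on the logit scale from fitted log-odds (one per individual).

    Assumption: population variance (divide by n) is used for the
    fitted values; other implementations may divide by n - 1.
    """
    n = len(linear_predictors)
    mean = sum(linear_predictors) / n
    var_fitted = sum((x - mean) ** 2 for x in linear_predictors) / n
    var_error = math.pi ** 2 / 3  # variance of the standard logistic distribution
    return var_fitted / (var_fitted + var_error)

# Hypothetical fitted log-odds, for illustration only.
example_logits = [-0.5, -0.5, 0.0, 0.5, 0.5]
r2 = mckelvey_zavoina_r2(example_logits)
print(f"pseudo-R^2 = {r2:.4f}")
```

Because the denominator uses the logistic error variance, the result lives on the logit scale, which is why the caption cautions against comparing it with liability-scale heritability.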
Figure 3. Correlated association signals at HLA alleles support potential alternate association models for both CD and UC
Equivalence of effect at the different study-wide significant associated 4-digit HLA alleles is shown for (a) CD and (b) UC. The structures illustrated in the figure are not classically defined haplotype structures; they were identified entirely from the correlation of signal defined through pairwise reciprocal conditional logistic regression analyses (see Supplementary Tables 2 and 3), although such correlations clearly depend on the underlying haplotypic structure of the region. Alleles identified as primary tags for independent association signals in our HLA-DRB1 focused models are shown in light blue boxes, while alternate alleles with equivalent effects are shown in grey boxes. Alleles in white boxes show study-wide significant secondary effects that can be explained entirely by the selected HLA alleles. Alleles at the HLA-DRB3, -DRB4 and -DRB5 genes were omitted to simplify the display; many of the alleles at these genes have high frequency and are therefore correlated with many different alleles (both risk and protective) at the other class II genes. Of note, the HLA-DRB4*null allele is the second most strongly associated allele in UC (see Supplementary Table 3).
Figure 4. HLA-DR peptide binding groove electrostatic properties and risk of IBD
The electrostatic potential of all HLA-DR alleles associated with UC or CD, and of all common HLA-DR alleles (frequency >1%), was calculated. HLA-DR alleles associated with increased or decreased risk of IBD at the study-wide significance level (P<5×10^-6) are shown in dark red or dark blue, respectively. The corresponding risk associations at the suggestive level (5×10^-6<P<1×10^-4) are shown in pale red and pale blue. Electrostatic potential comparisons among HLA-DR molecules were performed in a pairwise, all-versus-all fashion (see Online Methods) to produce distance matrices that are displayed as symmetrical heatmaps (scale ranges from 0 [identical] to 1 [maximum difference]). (a) The electrostatic potentials in seven regions within the peptide binding groove (see Online Methods and Supplementary Fig. 10), which interact with the presented peptide, were compared among the HLA-DR alleles and pooled into a single Euclidean distance matrix. The distance-based clustering identifies four clusters, with an enrichment of risk alleles in two of these. Comparison of the electrostatic potential at individual peptide binding groove regions is shown in Supplementary Fig. 13. (b) Heatmap representing electrostatic potential differences among the HLA-DR alleles at a spherical region that encompasses amino acid residues 67, 70 and 71 of the HLA-DRβ chain (associated with risk for UC and CD; Supplementary Table 13). The distance-based clustering identifies two clusters that correlate with directionality of effect in IBD.
Figure 5. Non-additive effect models in CD and UC
Evidence for non-additive effects of common variants (frequency >5%) across the MHC, tested under a general model of additive and dominance effects (Online Methods), in CD (a,b) and UC (c,d). The p-values and directionality for departure from additive effect (dominance term) are represented on the y-axis (a,c). HLA alleles and amino acid variants are in yellow and red, respectively, while SNPs are in dark grey. Variants with a non-significant (P>5×10^-6) dominance term are plotted in less pronounced colors. A clear enrichment for lower risk in heterozygotes is observed in UC (c), as suggested by the large number of significant negative dominance terms (lower part of the plot). This effect is absent, or much weaker, in CD (a). The dominance term OR (y-axis) is plotted against the additive term (x-axis) (b,d). Protective and risk minor alleles are shown on the left and right sides of the plot, respectively. Strictly recessive or dominant variants are expected to fall on the diagonals, while strictly additive variants lie on or close to the x-axis. The y-axis is the expected position for pure over- or underdominance. In UC (d), many alleles fall into the region of the plot corresponding to protective dominant, risk recessive or overdominant effects (blue triangle) (see Supplementary Table 9 for pairwise comparison of HLA-DRB1 alleles). These non-additive effects are observed for many variants in UC (c,d) (e.g., HLA-DRB1*03:01 and HLA-DQB1*02:01) but are mostly absent in CD (a,b), a notable exception being the HLA-B*08 allele (Supplementary Fig. 6).
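The geometry described for panels (b,d) can be sketched with one common parameterization of an additive-plus-dominance model, in which the additive term counts minor alleles (0, 1, 2) and the dominance term is a heterozygote indicator. This coding is an assumption made for illustration; the paper's own parameterization is given in its Online Methods, and the function names below are hypothetical. Under this coding, a dominance coefficient equal to the additive coefficient gives a strictly dominant variant, and one equal to its negative gives a strictly recessive variant, which is why such variants fall on the diagonals.

```python
def genotype_log_odds(beta_add, beta_dom):
    """Log-odds relative to the major-allele homozygote for 0, 1 and 2
    copies of the minor allele (assumed coding: additive = allele count,
    dominance = 1 for heterozygotes, 0 otherwise)."""
    return (0.0, beta_add + beta_dom, 2.0 * beta_add)

def classify(beta_add, beta_dom, tol=1e-9):
    """Name the idealized inheritance pattern implied by the coefficients."""
    if abs(beta_dom) <= tol:
        return "additive"                   # on the x-axis
    if abs(beta_add) <= tol:
        return "pure over/underdominance"   # on the y-axis
    if abs(beta_dom - beta_add) <= tol:
        return "dominant"                   # heterozygote equals minor homozygote
    if abs(beta_dom + beta_add) <= tol:
        return "recessive"                  # heterozygote equals major homozygote
    return "intermediate"

# Hypothetical coefficients (log odds ratios), for illustration only.
for b_add, b_dom in [(0.4, 0.0), (0.4, 0.4), (0.4, -0.4), (0.0, -0.3)]:
    print(b_add, b_dom, classify(b_add, b_dom), genotype_log_odds(b_add, b_dom))
```

With this coding, a risk-recessive allele and a protective-dominant allele both carry a negative dominance coefficient, matching the caption's grouping of these patterns with lower risk in heterozygotes.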
Figure 6. Comparison of odds ratio in CD and UC for HLA alleles identified from HLA-focused models
Odds ratios (OR) from the primary univariate association analyses in CD and UC for all alleles identified in the HLA-focused models of CD and/or UC are presented with 95% confidence intervals (a). Odds ratios for CD and UC are in blue and red, respectively; darker colors indicate study-wide significant effect (P<5×10^-6), lighter colors indicate nominal significance (0.05>P≥5×10^-6) and white indicates non-significance (P≥0.05) (for specific effect and significance values refer to Fig. 3 and Supplementary Tables 2 and 3). Allele HLA-B*52:01 is indicated for UC in place of the equivalent HLA-C*12:02 to simplify the display of this shared signal. For the same HLA alleles, odds ratios (with 95% confidence intervals) for an IBD analysis are plotted against the odds ratios for the CD versus UC analysis, with the IBD risk allele as the reference (b). Empty circles represent variants where the absence of the allele is the reference. Alleles identified as significant in CD only or UC only are plotted in blue and red, respectively. Variants identified as significant in both are shown in purple. Note that HLA-DRB1*07:01 and HLA-DRB1*13:02 have opposite directions of effect in CD and UC. Shared association signals are expected to fall in the upper triangle of the plot. Most variants fall outside of this region, highlighting the difference between CD and UC in the MHC.
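For readers unfamiliar with how an odds ratio and its 95% confidence interval relate, the sketch below computes both from a 2×2 table of carrier counts using the standard Wald interval on the log scale. This is a simplification: the odds ratios in the paper come from regression-based association analyses, not from raw 2×2 tables. The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci(case_carrier, case_non, ctrl_carrier, ctrl_non, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    z = 1.96 corresponds to an approximate 95% interval.
    All four counts must be positive.
    """
    odds_ratio = (case_carrier * ctrl_non) / (case_non * ctrl_carrier)
    # Standard error of log(OR) (Woolf's method).
    se_log = math.sqrt(
        1 / case_carrier + 1 / case_non + 1 / ctrl_carrier + 1 / ctrl_non
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only (not data from the study).
or_est, ci_low, ci_high = odds_ratio_ci(30, 970, 10, 990)
print(f"OR = {or_est:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 corresponds to nominal significance at roughly the 0.05 level, which is a far weaker threshold than the study-wide P<5×10^-6 used in the figure.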
