The major histocompatibility complex (MHC) is located on chromosome 6p21 and contains crucial regulators of the immune response, including human leucocyte antigen (HLA) genes, alongside other genes with nonimmunological roles. More recently, a repertoire of noncoding RNA genes, including expressed pseudogenes, has also been identified. The MHC is the most gene-dense and most polymorphic region of the human genome. The region exhibits haplotype-specific linkage disequilibrium patterns, contains the strongest cis- and trans-eQTLs/meQTLs in the genome and is known as a hot spot for disease associations. Extreme structural variation and copy number variation add another layer of complexity. While the HLA-B gene has the highest number of alleles, the HLA-DR/DQ subregion is structurally the most variable and shows the highest number of disease associations. Reliance on a single reference sequence has complicated the design, execution and analysis of genomewide association studies (GWAS) for the MHC region, and the region has not infrequently been excluded from the analysis of GWAS data altogether. Here, we contrast features of the MHC region with the rest of the genome and highlight its complexities, including its functional polymorphisms beyond those determined by single nucleotide polymorphisms (SNPs) or single amino acid residues. One of several issues with customary GWAS analysis is that it does not address this additional layer of polymorphism unique to the MHC region. We highlight alternative approaches that may assist with the analysis of GWAS data from the MHC region and unravel associations with all functional polymorphisms beyond individual SNPs. We suggest that, despite already showing the highest number of disease associations, the true extent of the involvement of the MHC region in disease genetics may not yet have been uncovered.
Keywords: HLA complex; disease predisposition; genetic predisposition to disease; genetic variation; genome biology; genomewide association studies.
© 2017 John Wiley & Sons Ltd.