Construction and benchmarking of a multi-ethnic reference panel for the imputation of HLA class I and II alleles

Hum Mol Genet. 2019 Jun 15;28(12):2078-2092. doi: 10.1093/hmg/ddy443.

Abstract

Genotype imputation of the human leukocyte antigen (HLA) region is a cost-effective means to infer classical HLA alleles from inexpensive and dense SNP array data. In the research setting, imputation avoids the cost of wet lab-based HLA typing and thus makes association analyses of the HLA feasible in large cohorts. Yet most HLA imputation reference panels target Caucasian ethnicities, and multi-ethnic panels are scarce. We compiled a high-quality multi-ethnic reference panel based on genotypes measured with Illumina's Immunochip genotyping array and HLA types established using a high-resolution next-generation sequencing approach. Our reference panel includes more than 1,300 samples from Germany, Malta, China, India, Iran, Japan and Korea, as well as samples of African American ancestry, and covers all classical HLA class I and II alleles including HLA-DRB3/4/5. Applying extensive cross-validation, we benchmarked the imputation using the HLA imputation tool HIBAG, our multi-ethnic reference and an independent, previously published data set compiled from subpopulations of the 1000 Genomes Project. We achieved average imputation accuracies higher than 0.924 for the commonly studied HLA-A, -B, -C, -DQB1 and -DRB1 genes across all ethnicities. We investigated allele-specific imputation challenges with regard to the geographic origin of the samples using sensitivity and specificity measurements as well as allele frequencies, and identified, for each population separately, the HLA alleles that are challenging to impute. In conclusion, our new multi-ethnic reference data set allows for high-resolution HLA imputation of genotypes at all classical HLA class I and II genes, including the HLA-DRB3/4/5 loci, based on populations of diverse ancestry.
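To make the benchmarking measures named in the abstract concrete, the following is a minimal Python sketch of how an overall imputation accuracy and per-allele sensitivity, specificity and allele frequency could be computed from typed and imputed genotypes at one locus. It is an illustration only, not the authors' pipeline or HIBAG code. The function names, the allele-level counting scheme (two alleles per sample, compared without regard to order) and the toy genotypes are all assumptions introduced here; the paper's exact definitions may differ.

```python
from collections import Counter


def matched_alleles(true_pair, pred_pair):
    """Count alleles shared by two unordered genotypes (0, 1 or 2)."""
    overlap = Counter(true_pair) & Counter(pred_pair)  # multiset intersection
    return sum(overlap.values())


def allele_accuracy(true_genos, pred_genos):
    """Fraction of correctly imputed alleles over all 2 * N alleles (assumed definition)."""
    total = 2 * len(true_genos)
    hits = sum(matched_alleles(t, p) for t, p in zip(true_genos, pred_genos))
    return hits / total


def per_allele_metrics(true_genos, pred_genos):
    """Allele-level sensitivity, specificity and frequency for every observed allele."""
    total = 2 * len(true_genos)
    alleles = {a for geno in list(true_genos) + list(pred_genos) for a in geno}
    metrics = {}
    for allele in sorted(alleles):
        tp = fn = fp = 0
        for t, p in zip(true_genos, pred_genos):
            n_true, n_pred = t.count(allele), p.count(allele)
            tp += min(n_true, n_pred)      # copies typed and imputed
            fn += max(n_true - n_pred, 0)  # copies typed but not imputed
            fp += max(n_pred - n_true, 0)  # copies imputed but not typed
        tn = total - tp - fn - fp
        metrics[allele] = {
            "frequency": (tp + fn) / total,  # frequency in the typed data
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "specificity": tn / (tn + fp) if tn + fp else None,
        }
    return metrics


# Hypothetical toy data: typed (true) and imputed HLA-A genotypes for four samples.
true_genos = [
    ("A*01:01", "A*02:01"),
    ("A*02:01", "A*02:01"),
    ("A*03:01", "A*24:02"),
    ("A*01:01", "A*24:02"),
]
pred_genos = [
    ("A*02:01", "A*01:01"),  # same genotype, different order: both alleles match
    ("A*02:01", "A*03:01"),  # one of two alleles matches
    ("A*03:01", "A*24:02"),
    ("A*01:01", "A*02:01"),  # A*24:02 missed, A*02:01 called instead
]

accuracy = allele_accuracy(true_genos, pred_genos)
metrics = per_allele_metrics(true_genos, pred_genos)

if __name__ == "__main__":
    print(f"overall accuracy: {accuracy:.3f}")
    for allele, m in metrics.items():
        print(allele, m)
```

Reporting sensitivity and specificity next to the allele frequency, as in this sketch, helps separate alleles that are hard to impute because they are rare in a population from those that are systematically confused with another allele.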

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • African Americans / ethnology
  • African Americans / genetics
  • Alleles
  • Asian Continental Ancestry Group
  • Benchmarking
  • Cluster Analysis
  • Ethnic Groups
  • European Continental Ancestry Group / ethnology
  • European Continental Ancestry Group / genetics
  • Gene Frequency
  • Genotype
  • HLA Antigens / genetics
  • HLA-DRB3 Chains / genetics
  • HLA-DRB4 Chains / genetics
  • HLA-DRB5 Chains / genetics
  • Haplotypes
  • High-Throughput Nucleotide Sequencing
  • Histocompatibility Antigens Class I / genetics*
  • Histocompatibility Antigens Class II / genetics*
  • Humans
  • Polymorphism, Single Nucleotide
  • Retrospective Studies

Substances

  • HLA Antigens
  • HLA-DRB3 Chains
  • HLA-DRB4 Chains
  • HLA-DRB5 Chains
  • Histocompatibility Antigens Class I
  • Histocompatibility Antigens Class II