HLA-HD: An accurate HLA typing algorithm for next-generation sequencing data

Hum Mutat. 2017 Jul;38(7):788-797. doi: 10.1002/humu.23230. Epub 2017 May 12.


Accurate typing of human leukocyte antigen (HLA) alleles is critical for a variety of medical applications, such as genomic studies of multifactorial diseases, including immune system and inflammation-related disorders, and donor selection in organ transplantation and regenerative medicine. Here, we developed a new algorithm for determining HLA alleles from next-generation sequencing (NGS) data. The method consists of constructing an extensive dictionary of HLA alleles, precisely mapping the NGS reads, and calculating a score based on weighted read counts to select the most suitable pair of alleles. The algorithm compares the scores of all allele pairs, taking into account variation not only in the domain for antigen presentation (G-DOMAIN) but also outside this domain. Using this method, HLA alleles could be determined with 6-digit precision. We showed that our method was more accurate than other NGS-based methods and revealed limitations of conventional HLA typing technologies. Furthermore, by assembling NGS reads that had caused questionable typing results, we determined the complete genomic sequence of an HLA-A-like pseudogene and found that it was identical to HLA-Y*02:01. Including the HLA-Y*02:01 sequence in the HLA allele dictionary improved the accuracy of HLA-A allele typing.
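
The pair-selection step described above can be illustrated with a minimal sketch. This is not the HLA-HD implementation: the weighting scheme (each read counts 1 divided by the number of alleles it maps to), the function names, and the toy read-to-allele mapping are all assumptions made for illustration, and the abstract does not specify how the actual weights are computed.

```python
from itertools import combinations_with_replacement

# Hypothetical toy input: each read ID maps to the set of dictionary
# alleles it aligned to. Real data would come from a read mapper.
read_hits = {
    "r1": {"A*01:01:01", "A*02:01:01"},
    "r2": {"A*01:01:01"},
    "r3": {"A*02:01:01", "A*03:01:01"},
    "r4": {"A*02:01:01"},
}


def read_weights(hits):
    """Assumed weighting: a read mapping to k alleles contributes 1/k."""
    return {read: 1.0 / len(alleles) for read, alleles in hits.items() if alleles}


def score_pair(pair, hits, weights):
    """Sum the weights of all reads explained by either allele of the pair."""
    a, b = pair
    return sum(w for read, w in weights.items() if a in hits[read] or b in hits[read])


def best_allele_pair(hits):
    """Score every allele pair (homozygous pairs included) and return the best.

    Ties are resolved by enumeration order, which is arbitrary here.
    """
    weights = read_weights(hits)
    alleles = sorted(set().union(*hits.values()))
    pairs = combinations_with_replacement(alleles, 2)
    return max(pairs, key=lambda p: score_pair(p, hits, weights))
```

With the toy data, `best_allele_pair(read_hits)` selects the pair `A*01:01:01` / `A*02:01:01`, because together those two alleles explain all four reads. Exhaustively comparing all pairs is quadratic in the number of candidate alleles, so a practical tool would likely narrow the candidates first; how HLA-HD does this is not stated in the abstract.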

Keywords: HLA gene; genotyping; next-generation sequencing; software.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alleles
  • Chromosome Mapping
  • Cloning, Molecular
  • Computational Biology / methods*
  • DNA Primers
  • Databases, Factual
  • Exons
  • Genome, Human
  • Genomics
  • HLA Antigens / biosynthesis
  • HLA Antigens / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Histocompatibility Testing*
  • Humans
  • Models, Statistical
  • Polymerase Chain Reaction
  • Pseudogenes
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*

Substances

  • DNA Primers
  • HLA Antigens