Clustering and alignment of polymorphic sequences for HLA-DRB1 genotyping

PLoS One. 2013;8(3):e59835. doi: 10.1371/journal.pone.0059835. Epub 2013 Mar 28.

Abstract

Located on chromosome 6p21, the classical human leukocyte antigen (HLA) genes are highly polymorphic. HLA alleles are associated with a variety of phenotypes, including narcolepsy, autoimmunity, and the immunologic response to infectious disease. Moreover, high-resolution genotyping of these loci is critical to achieving long-term survival of allogeneic transplants. Methods for high-resolution analysis of HLA genotypes will improve understanding of how select alleles contribute to human health and disease risk. Genomic DNA was obtained from a cohort of 383 subjects recruited as part of an ulcerative colitis study and analyzed for HLA-DRB1. HLA genotypes were determined using sequence-specific oligonucleotide probes and by next-generation sequencing on the Roche/454 GS FLX instrument. The Clustering and Alignment of Polymorphic Sequences (CAPSeq) software application was developed to analyze the next-generation sequencing data. The application generates HLA sequence-specific 6-digit genotype information from these data, using MUMmer to align sequences and the R package diffusionMap to classify sequences into their respective allelic groups. Incorporating bootstrap aggregating (bagging) to aid in sorting sequences into allele classes improved genotyping accuracy. With 60 bagging iterations, CAPSeq genotypes showed high concordance with the 4-digit genotypes characterized by sequence-specific oligonucleotide probes, matching at 759 of 766 alleles (99.1%).
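As an illustration of the concordance figure reported above, the following is a minimal sketch of how allele-level agreement between 6-digit sequencing calls and 4-digit probe calls might be tallied. It is not the CAPSeq implementation. The function names, the colon-delimited allele notation (for example `DRB1*15:01:01`), the truncation to the first two fields, and the unordered matching of the two alleles per subject are all assumptions made for this example.

```python
from collections import Counter


def to_four_digit(allele: str) -> str:
    """Truncate an allele name to its first two colon-separated fields.

    Assumes names of the form GENE*ff:ff[:ff], e.g. "DRB1*15:01:01".
    """
    gene, _, fields = allele.partition("*")
    return f"{gene}*{':'.join(fields.split(':')[:2])}"


def matching_alleles(ngs_calls, ssop_calls) -> int:
    """Count alleles shared by two unordered genotype calls for one subject."""
    ngs = Counter(to_four_digit(a) for a in ngs_calls)
    ssop = Counter(to_four_digit(a) for a in ssop_calls)
    # Multiset intersection keeps the minimum count of each allele,
    # so a homozygous call only matches twice if both methods agree.
    return sum((ngs & ssop).values())


def concordance(subjects):
    """Return (matched, total, rate) over (ngs_calls, ssop_calls) pairs."""
    matched = sum(matching_alleles(n, s) for n, s in subjects)
    total = sum(len(s) for _, s in subjects)
    return matched, total, matched / total


# Hypothetical two-subject example (not data from the study).
example = [
    (["DRB1*15:01:01", "DRB1*03:01:01"], ["DRB1*03:01", "DRB1*15:01"]),
    (["DRB1*04:01:01", "DRB1*07:01:01"], ["DRB1*04:01", "DRB1*11:01"]),
]
matched, total, rate = concordance(example)
```

The reported denominator is consistent with two alleles per subject (383 × 2 = 766), and 759/766 rounds to 99.1%.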

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alleles
  • Cluster Analysis
  • Colitis, Ulcerative / genetics*
  • Genotype
  • Genotyping Techniques
  • HLA-DRB1 Chains / genetics*
  • Humans
  • Oligonucleotide Probes
  • Polymorphism, Genetic*
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods
  • Software

Substances

  • HLA-DRB1 Chains
  • Oligonucleotide Probes