IRProfiler - a software toolbox for high throughput immune receptor profiling

BMC Bioinformatics. 2018 Apr 18;19(1):144. doi: 10.1186/s12859-018-2144-z.


Background: The study of the huge diversity of immune receptors, often referred to as immune repertoire profiling, is a prerequisite for diagnosis, prognostication and monitoring of hematological disorders. In the era of high-throughput sequencing (HTS), the abundance of immunogenetic data has created unprecedented opportunities for the thorough profiling of T-cell receptors (TR) and B-cell receptors (BcR). However, the volume of data to be analyzed calls for efficient and easy-to-use immune repertoire profiling software applications.

Results: This work introduces Immune Repertoire Profiler (IRProfiler), a novel software pipeline that delivers a number of core receptor repertoire quantification and comparison functionalities on high-throughput TR and BcR sequencing data. Adopting 5 alternative clonotype definitions, IRProfiler implements a series of algorithms for 1) data filtering, 2) calculation of clonotype diversity and expression, 3) calculation of gene usage for the V and J subgroups, 4) detection of shared and exclusive clonotypes among multiple repertoires, and 5) comparison of gene usage for V and J subgroups among multiple repertoires. IRProfiler has been implemented as a toolbox of the Galaxy bioinformatics platform, comprising 6 tools. Theoretical and experimental evaluation has shown that the tools of IRProfiler scale well with respect to the size of the input dataset(s). IRProfiler has been used in a number of recently published studies concerning hematological disorders.
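To make steps 2 and 4 above more concrete, the following is a minimal Python sketch and not IRProfiler's actual code. It assumes one hypothetical clonotype definition, the pair (V gene, CDR3 amino acid sequence), whereas the abstract does not spell out its 5 definitions. It also assumes that each read is a plain dictionary with the hypothetical keys `v_gene` and `cdr3_aa`, and it uses the Shannon index as one possible diversity measure. All function names are illustrative.

```python
import math
from collections import Counter


def clonotype_counts(reads):
    """Count reads per clonotype.

    Assumption: a clonotype is the pair (V gene, CDR3 amino acid sequence).
    This is only one of several possible definitions.
    """
    return Counter((r["v_gene"], r["cdr3_aa"]) for r in reads)


def relative_frequencies(counts):
    """Convert raw clonotype counts to fractions of the repertoire."""
    total = sum(counts.values())
    return {clonotype: n / total for clonotype, n in counts.items()}


def shannon_diversity(counts):
    """Shannon index (natural log) over clonotype frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def shared_and_exclusive(repertoires):
    """Split clonotypes into those shared by all repertoires and those
    found in exactly one repertoire.

    `repertoires` maps a sample name to its clonotype Counter.
    """
    sets = {name: set(counts) for name, counts in repertoires.items()}
    shared = set.intersection(*sets.values())
    exclusive = {}
    for name, own in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        exclusive[name] = own - others
    return shared, exclusive
```

A call such as `shared_and_exclusive({"patient_A": counts_a, "patient_B": counts_b})` would return the clonotypes common to both samples together with a per-sample set of exclusive ones. Swapping the tuple built in `clonotype_counts` is enough to try a different clonotype definition.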

Conclusion: IRProfiler is made freely available via 3 distribution channels, including the Galaxy Tool Shed. Despite being a new entry in a crowded ecosystem of immune repertoire profiling software, IRProfiler derives its added value from its support for alternative clonotype definitions, combined with properties stemming from its user-centric design, namely ease of use, ease of access, exploitability of the output data, and analysis flexibility.

Keywords: B-cell receptors; High-throughput sequencing; Immune receptor profiling; Software pipeline; T-cell receptors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Hematologic Diseases / diagnosis
  • Hematologic Diseases / genetics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Receptors, Antigen, B-Cell / genetics*
  • Receptors, Antigen, B-Cell / immunology
  • Receptors, Antigen, T-Cell / genetics*
  • Receptors, Antigen, T-Cell / immunology
  • Sequence Analysis, DNA
  • Software*


Substances

  • Receptors, Antigen, B-Cell
  • Receptors, Antigen, T-Cell