Recent advances in the next-generation sequencing of B-cell receptors (BCRs) enable the characterization of humoral responses at a repertoire-wide scale and provide the capability for identifying unique features of immune repertoires in response to disease, vaccination, or infection. Immunosequencing now readily generates 10³–10⁵ sequences per sample; however, statistical analysis of these repertoires is challenging because of the high genetic diversity of BCRs and the elaborate clonal relationships among them. To date, most immunosequencing analyses have focused on reporting qualitative trends in immunoglobulin (Ig) properties, such as usage or somatic hypermutation (SHM) percentage of the Ig heavy chain variable (IGHV) gene segment family, and on reducing complex Ig property distributions to simple summary statistics. However, because Ig properties are typically not normally distributed, any approach that fails to assess the distribution as a whole may be inadequate in (1) properly assessing the statistical significance of repertoire differences, (2) identifying how two repertoires differ, and (3) determining appropriate confidence intervals for assessing the size of the differences and their potential biological relevance. To address these issues, we have developed a technique that uses Wilcox's robust statistics toolbox to identify statistically significant vaccine-specific differences between Ig repertoire properties. The advantage of this technique is that it can determine not only whether but also where the distributions differ, even when the Ig repertoire properties are non-normally distributed. We used this technique to characterize murine germinal center (GC) B-cell repertoires in response to a complex Ebola virus-like particle (eVLP) vaccine candidate with known protective efficacy. The eVLP-mediated GC B-cell responses were highly diverse, consisting of thousands of clonotypes.
Despite this staggering diversity, we identified statistically significant differences between non-immunized, vaccine-only, and vaccine-plus-adjuvant groups in terms of Ig properties, including IGHV-family usage, SHM percentage, and characteristics of the BCR complementarity-determining region. Most notably, our analyses identified a robust eVLP-specific feature: enhanced IGHV8-family usage in B-cell repertoires. These findings demonstrate the utility of our technique in identifying statistically significant BCR repertoire differences following vaccination. More generally, our approach is potentially applicable to a wide range of studies in infection, vaccination, autoimmunity, and cancer.
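To illustrate what it means to ask where two non-normal distributions differ, rather than only whether they differ, the following Python sketch compares two samples decile by decile and attaches percentile-bootstrap confidence intervals to each difference. It is a simplified illustration and not the implementation used in this work: the function name `decile_shift`, its parameters, and the synthetic data are hypothetical, plain sample quantiles stand in for the robust quantile estimators in Wilcox's toolbox, and no multiple-comparison correction is applied.

```python
import numpy as np


def decile_shift(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Compare two samples at each decile (hypothetical helper).

    Returns the decile probabilities, the observed differences (y minus x)
    at each decile, and percentile-bootstrap confidence bounds for them.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    probs = np.arange(1, 10) / 10  # deciles 0.1 ... 0.9

    # Observed difference between the two groups at each decile.
    observed = np.quantile(y, probs) - np.quantile(x, probs)

    # Resample each group independently, with replacement, and recompute
    # the decile differences to approximate their sampling distribution.
    boot = np.empty((n_boot, probs.size))
    for b in range(n_boot):
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        boot[b] = np.quantile(yb, probs) - np.quantile(xb, probs)

    lower = np.quantile(boot, alpha / 2, axis=0)
    upper = np.quantile(boot, 1 - alpha / 2, axis=0)
    return probs, observed, lower, upper


if __name__ == "__main__":
    # Synthetic, skewed data standing in for a per-sequence Ig property
    # such as SHM percentage (assumed values, for illustration only).
    demo_rng = np.random.default_rng(1)
    control = demo_rng.gamma(shape=2.0, scale=1.0, size=500)
    treated = demo_rng.gamma(shape=2.0, scale=1.5, size=500)

    for p, d, lo, hi in zip(*decile_shift(control, treated)):
        flag = "differs" if lo > 0 or hi < 0 else "no clear difference"
        print(f"decile {p:.1f}: diff = {d:.2f}, CI = [{lo:.2f}, {hi:.2f}] ({flag})")
```

A decile whose interval excludes zero marks a region of the distribution where the two groups differ, which is the kind of information a single summary statistic cannot provide.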
Keywords: B cell; Ebola; clonotype; germinal center; immunoglobulin; immunosequencing; repertoire properties; statistical analysis.