Data obtained with cytometry are increasingly complex, and how they are interrogated affects the type and quality of knowledge gained. Conventional supervised analyses are limited to pre-defined cell populations and do not exploit the full potential of the data. Here, in the context of a clinical trial of cancer patients treated with radiotherapy, we performed longitudinal flow cytometry analyses to identify multiple distinct cell populations in circulating whole blood. We compared the results of state-of-the-art recommended supervised analyses with those of MegaClust, a high-performance data-driven clustering algorithm that allows fast and robust identification of cell-type populations. Ten distinct cell populations were accurately identified by supervised analyses, including the main T cell, B cell, dendritic cell (DC), natural killer (NK) cell and monocyte subsets. While all ten subsets were also identified with MegaClust, additional cell populations were revealed (e.g. CD4+HLA-DR+ and NKT-like subsets), and DC profiling was enriched by the assignment of additional subset-specific markers. Comparison of the transcriptomic profiles of purified DC populations with publicly available datasets confirmed the accuracy of the unsupervised clustering algorithm and demonstrated its potential to identify rare and scarcely described cell subsets. Our observations show that data-driven analyses of cytometry data significantly enrich the amount and quality of knowledge gained, representing an important step in refining the characterization of immune responses.
Keywords: analytical immunology; data-driven analysis; flow cytometry; immune monitoring; unsupervised clustering.
Copyright © 2021 Baumgaertner, Sankar, Herrera, Benedetti, Barras, Thierry, Dangaj, Kandalaft, Coukos, Xenarios, Guex and Harari.