Motivation: For immune system monitoring in large-scale studies at single-cell resolution using CyTOF, (semi-)automated computational methods are applied to annotate live cells of mixed cell types. Here, we show that the live cell pool can be highly enriched with undefined heterogeneous cells, i.e. 'ungated' cells, and that current semi-automated approaches do not model these cells, which results in misclassified annotations.
Results: We introduce 'CyAnno', a novel semi-automated approach for deconvoluting unlabeled cytometry datasets. It is based on a machine learning framework that uses manually gated training data and allows integrative modeling of both the 'gated' cell types and the 'ungated' cells. By applying this framework to several CyTOF datasets, we demonstrate that including the 'ungated' cells can lead to a significant increase in the precision of 'gated' cell type prediction. CyAnno can be used to identify even a single cell type, including rare cells, with higher efficacy than current state-of-the-art semi-automated approaches.
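The effect of modeling 'ungated' cells can be illustrated with a toy example. The sketch below is not the CyAnno algorithm: it swaps in a plain nearest-centroid classifier, and the two marker intensities, the cell values, and the helper names (`fit_centroids`, `predict`, `precision`) are all hypothetical. It only shows why a classifier trained without an 'ungated' class is forced to assign every undefined cell to some gated type, which lowers precision.

```python
from collections import defaultdict

def fit_centroids(cells, labels):
    """Mean marker vector per label (a stand-in model, not CyAnno's)."""
    groups = defaultdict(list)
    for cell, label in zip(cells, labels):
        groups[label].append(cell)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

def predict(centroids, cell):
    """Assign the label whose centroid is closest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(cell, centroids[label])),
    )

def precision(true, pred, label):
    """Fraction of cells predicted as `label` that truly are `label`."""
    hits = [t for t, p in zip(true, pred) if p == label]
    return sum(t == label for t in hits) / len(hits) if hits else 0.0

# Hypothetical training cells: (marker_1, marker_2) intensities.
train_cells = [
    (5.0, 1.0), (5.2, 0.8), (4.8, 1.2),   # manually gated T cells
    (1.0, 5.0), (0.8, 5.2), (1.2, 4.8),   # manually gated B cells
    (1.0, 1.0), (1.5, 0.5), (0.5, 1.5),   # live cells left ungated
]
train_labels = ["T cell"] * 3 + ["B cell"] * 3 + ["ungated"] * 3

# Hypothetical unlabeled sample with known ground truth for evaluation.
test_cells = [(4.9, 1.1), (1.1, 4.9), (1.2, 0.9), (2.0, 1.0), (0.9, 2.0)]
test_truth = ["T cell", "B cell", "ungated", "ungated", "ungated"]

# Model A: trained on gated cell types only (ungated cells ignored).
gated_only = [(c, l) for c, l in zip(train_cells, train_labels) if l != "ungated"]
model_without = fit_centroids(*zip(*gated_only))
pred_without = [predict(model_without, c) for c in test_cells]

# Model B: 'ungated' is modeled as its own class alongside the gated types.
model_with = fit_centroids(train_cells, train_labels)
pred_with = [predict(model_with, c) for c in test_cells]

for cell_type in ("T cell", "B cell"):
    print(
        cell_type,
        "precision without ungated:", round(precision(test_truth, pred_without, cell_type), 2),
        "with ungated:", round(precision(test_truth, pred_with, cell_type), 2),
    )
```

In this toy setting, the model trained only on gated types labels every ungated cell as a T cell or B cell, whereas the model that includes an 'ungated' class can set those cells aside.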
Availability: CyAnno is available as a Python script with a user manual and sample dataset at https://github.com/abbioinfo/CyAnno.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) (2021). Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org.