Automated subset identification and characterization pipeline for multidimensional flow and mass cytometry data clustering and visualization

Commun Biol. 2019 Jun 20;2:229. doi: 10.1038/s42003-019-0467-6. eCollection 2019.


When examining datasets of any dimensionality, researchers frequently aim to identify individual subsets (clusters) of objects within the dataset. The ubiquity of multidimensional data has motivated the replacement of user-guided clustering with fully automated clustering. The fully automated methods are designed to make clustering more accurate, standardized and faster. However, the adoption of these methods is still limited by the lack of intuitive visualization and cluster matching methods that would allow users to readily interpret fully automatically generated clusters. To address these issues, we developed a fully automated subset identification and characterization (SIC) pipeline providing robust cluster matching and data visualization tools for high-dimensional flow/mass cytometry (and other) data. This pipeline automatically (and intuitively) generates two-dimensional representations of high-dimensional datasets that are safe from the curse of dimensionality. This new approach allows more robust and reproducible data analysis, facilitating the development of new gold standard practices across laboratories and institutions.

Keywords: Computational platforms and environments; Statistical methods.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Biomarkers, Tumor / blood
  • Bone Marrow Cells
  • Cluster Analysis*
  • Data Visualization*
  • Flow Cytometry / methods*
  • Humans
  • Leukemia, Myeloid, Acute / blood
  • Lymphocytes / cytology
  • Mice, Inbred BALB C
  • Mice, Inbred C57BL
  • Mice, Knockout
  • Myeloid Cells / cytology
  • Pattern Recognition, Automated / methods*
  • Peritoneal Cavity / cytology

Substances

  • Biomarkers, Tumor